date:20130903

[Qemu-devel] [PATCH v2] exec: do tcg_commit only when tcg_enabled

2013-09-03 Thread liguang

Signed-off-by: liguang 
---
 exec.c |4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/exec.c b/exec.c
index 3ca9381..2170295 100644
--- a/exec.c
+++ b/exec.c
@@ -1824,7 +1824,9 @@ static void memory_map_init(void)
 address_space_init(&address_space_io, system_io, "I/O");
 
 memory_listener_register(&core_memory_listener, &address_space_memory);
-memory_listener_register(&tcg_memory_listener, &address_space_memory);
+if (tcg_enabled()) {
+memory_listener_register(&tcg_memory_listener, &address_space_memory);
+}
 }
 
 MemoryRegion *get_system_memory(void)
-- 
1.7.2.5

Re: [Qemu-devel] [PATCH] exec: avoid tcg_commit when kvm_enabled

2013-09-03 Thread Li Guang

On Wed, 2013-09-04 at 08:23 +0200, Paolo Bonzini wrote:
> On 04/09/2013 03:07, Li Guang wrote:
> > On Tue, 2013-09-03 at 10:39 +0200, Andreas Färber wrote:
> >> On 03.09.2013 08:59, liguang wrote:
> >>> Signed-off-by: liguang 
> >>> ---
> >>>  exec.c |4 +++-
> >>>  1 files changed, 3 insertions(+), 1 deletions(-)
> >>>
> >>> diff --git a/exec.c b/exec.c
> >>> index 3ca9381..4509daa 100644
> >>> --- a/exec.c
> >>> +++ b/exec.c
> >>> @@ -1824,7 +1824,9 @@ static void memory_map_init(void)
> >>>  address_space_init(&address_space_io, system_io, "I/O");
> >>>  
> >>>  memory_listener_register(&core_memory_listener, &address_space_memory);
> >>> -memory_listener_register(&tcg_memory_listener, &address_space_memory);
> >>> +if (!kvm_enabled()) {
> >>
> >> if (tcg_enabled())? I'm guessing Xen and QTest don't need it either?
> >>
> > 
> > I can't confirm that at the moment.
> > Can anybody help confirm whether Xen & QTest need tcg_commit?
> 

OK, Thanks!

> 
> >>
> >>> +memory_listener_register(&tcg_memory_listener, &address_space_memory);
> >>> +}
> >>>  }
> >>>  
> >>>  MemoryRegion *get_system_memory(void)
> >>
> > 
>

Re: [Qemu-devel] [PATCH] exec: avoid tcg_commit when kvm_enabled

2013-09-03 Thread Paolo Bonzini

On 04/09/2013 03:07, Li Guang wrote:
> On Tue, 2013-09-03 at 10:39 +0200, Andreas Färber wrote:
>> On 03.09.2013 08:59, liguang wrote:
>>> Signed-off-by: liguang 
>>> ---
>>>  exec.c |4 +++-
>>>  1 files changed, 3 insertions(+), 1 deletions(-)
>>>
>>> diff --git a/exec.c b/exec.c
>>> index 3ca9381..4509daa 100644
>>> --- a/exec.c
>>> +++ b/exec.c
>>> @@ -1824,7 +1824,9 @@ static void memory_map_init(void)
>>>  address_space_init(&address_space_io, system_io, "I/O");
>>>  
>>>  memory_listener_register(&core_memory_listener, &address_space_memory);
>>> -memory_listener_register(&tcg_memory_listener, &address_space_memory);
>>> +if (!kvm_enabled()) {
>>
>> if (tcg_enabled())? I'm guessing Xen and QTest don't need it either?
>>
> 
> I can't confirm that at the moment.
> Can anybody help confirm whether Xen & QTest need tcg_commit?

No, they don't.

Paolo

>>
>>> +memory_listener_register(&tcg_memory_listener, &address_space_memory);
>>> +}
>>>  }
>>>  
>>>  MemoryRegion *get_system_memory(void)
>>
>
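
The difference between the two guards discussed above can be shown with a minimal standalone sketch. This is not QEMU code: the `Accel` enum and both helper functions are hypothetical names used only for illustration, standing in for QEMU's `tcg_enabled()` and `kvm_enabled()` predicates.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the accelerator QEMU was started with. */
typedef enum { ACCEL_TCG, ACCEL_KVM, ACCEL_XEN, ACCEL_QTEST } Accel;

/* First version of the patch: register the TCG listener unless KVM is used. */
static bool register_if_not_kvm(Accel a)
{
    return a != ACCEL_KVM;
}

/* v2 of the patch: register the TCG listener only when TCG is used. */
static bool register_if_tcg(Accel a)
{
    return a == ACCEL_TCG;
}
```

With only TCG and KVM the two checks agree; they diverge for Xen and QTest, which is the case Andreas asked about and Paolo confirmed does not need the listener.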

Re: [Qemu-devel] [RFC v3 3/5] hw/arm/digic: add timer support

2013-09-03 Thread Peter Crosthwaite

On Wed, Sep 4, 2013 at 3:21 PM, Antony Pavlov  wrote:
> Signed-off-by: Antony Pavlov 
> ---
>  hw/arm/digic.c |  25 ++
>  hw/timer/Makefile.objs |   1 +
>  hw/timer/digic-timer.c | 122 +
>  hw/timer/digic-timer.h |  19 
>  include/hw/arm/digic.h |   7 +++
>  5 files changed, 174 insertions(+)
>  create mode 100644 hw/timer/digic-timer.c
>  create mode 100644 hw/timer/digic-timer.h
>
> diff --git a/hw/arm/digic.c b/hw/arm/digic.c
> index 95a9fcd..a71364b 100644
> --- a/hw/arm/digic.c
> +++ b/hw/arm/digic.c
> @@ -30,21 +30,46 @@
>  static void digic_init(Object *obj)
>  {
>  DigicState *s = DIGIC(obj);
> +DeviceState *dev;
> +int i;
>
>  object_initialize(&s->cpu, sizeof(s->cpu), "arm946-" TYPE_ARM_CPU);
>  object_property_add_child(obj, "cpu", OBJECT(&s->cpu), NULL);
> +
> +for (i = 0; i < DIGIC4_NB_TIMERS; i++) {
> +char name[9];
> +

Bit of a trap if there's ever more than 10 timers, as the name string
will run off the end once the %d below reaches 10. It's OK to round up a
little just for defensiveness.
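
A minimal standalone sketch of the sizing problem, assuming standard C99 `snprintf` semantics: with a 9-byte buffer "timer[9]" fits exactly (8 characters plus the NUL), while "timer[10]" needs 10 bytes and is silently truncated to "timer[10". The helper name `format_timer_name` is hypothetical and not part of the patch.

```c
#include <stdio.h>

/* Format "timer[<i>]" into buf and report whether the whole name fit. */
static int format_timer_name(char *buf, size_t size, int i)
{
    /* snprintf returns the length it wanted to write, excluding the NUL. */
    int n = snprintf(buf, size, "timer[%d]", i);

    return n >= 0 && (size_t)n < size;
}
```

Checking the return value, or simply declaring the buffer as something like `char name[16]`, keeps the child property names distinct if the timer count ever grows.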

> +object_initialize(&s->timer[i], sizeof(s->timer[i]), TYPE_DIGIC_TIMER);
> +dev = DEVICE(&s->timer[i]);
> +qdev_set_parent_bus(dev, sysbus_get_default());
> +snprintf(name, 9, "timer[%d]", i);
> +object_property_add_child(obj, name, OBJECT(&s->timer[i]), NULL);
> +}
>  }
>
>  static void digic_realize(DeviceState *dev, Error **errp)
>  {
>  DigicState *s = DIGIC(dev);
>  Error *err = NULL;
> +SysBusDevice *sbd;
> +int i;
>
>  object_property_set_bool(OBJECT(&s->cpu), true, "realized", &err);
>  if (err != NULL) {
>  error_propagate(errp, err);
>  return;
>  }
> +
> +for (i = 0; i < DIGIC4_NB_TIMERS; i++) {
> +object_property_set_bool(OBJECT(&s->timer[i]), true, "realized", &err);
> +if (err != NULL) {
> +error_propagate(errp, err);
> +return;
> +}
> +
> +sbd = SYS_BUS_DEVICE(&s->timer[i]);
> +sysbus_mmio_map(sbd, 0, DIGIC4_TIMER_BASE(i));
> +}
>  }
>
>  static void digic_class_init(ObjectClass *oc, void *data)
> diff --git a/hw/timer/Makefile.objs b/hw/timer/Makefile.objs
> index eca5905..5479aee 100644
> --- a/hw/timer/Makefile.objs
> +++ b/hw/timer/Makefile.objs
> @@ -25,5 +25,6 @@ obj-$(CONFIG_OMAP) += omap_synctimer.o
>  obj-$(CONFIG_PXA2XX) += pxa2xx_timer.o
>  obj-$(CONFIG_SH4) += sh_timer.o
>  obj-$(CONFIG_TUSB6010) += tusb6010.o
> +obj-$(CONFIG_DIGIC) += digic-timer.o
>
>  obj-$(CONFIG_MC146818RTC) += mc146818rtc.o
> diff --git a/hw/timer/digic-timer.c b/hw/timer/digic-timer.c
> new file mode 100644
> index 000..c6cf7ee
> --- /dev/null
> +++ b/hw/timer/digic-timer.c
> @@ -0,0 +1,122 @@
> +/*
> + * QEMU model of the Canon Digic timer block.
> + *
> + * Copyright (C) 2013 Antony Pavlov 
> + *
> + * This model is based on reverse engineering efforts
> + * made by CHDK (http://chdk.wikia.com) and
> + * Magic Lantern (http://www.magiclantern.fm) projects
> + * contributors.
> + *
> + * See "Timer/Clock Module" docs here:
> + *   http://magiclantern.wikia.com/wiki/Register_Map
> + *
> + * The QEMU model of the OSTimer in PKUnity SoC by Guan Xuetao
> + * is used as a template.
> + *
> + * This library is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2 of the License, or (at your option) any later version.
> + *
> + * This library is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with this library; if not, see 
> .
> + *
> + */
> +
> +#include "hw/sysbus.h"
> +#include "hw/ptimer.h"
> +#include "qemu/main-loop.h"
> +
> +#include "hw/timer/digic-timer.h"
> +
> +#ifdef DEBUG_DIGIC_TIMER
> +#define DPRINTF(fmt, ...) printf("%s: " fmt , __func__, ## __VA_ARGS__)
> +#else
> +#define DPRINTF(fmt, ...) do {} while (0)
> +#endif

I think we were encouraging unconditional compilation of error printfs
rather than stripping them. That way, bugs introduced by later changes
(for example, type changes that affect debug printf arguments) can be
caught in developer compile testing.

Something like this would work, although Andreas played with this
recently and may have more convincing ideas.

#ifndef XILINX_SPIPS_ERR_DEBUG
#define XILINX_SPIPS_ERR_DEBUG 0
#endif

/* Always compiled, so the arguments are type-checked even with debug off. */
#define DB_PRINT_L(level, ...) do { \
    if (XILINX_SPIPS_ERR_DEBUG > (level)) { \
        fprintf(stderr,  ": %s: ", __func__); \
        fprintf(stderr, ## __VA_ARGS__); \
    } \
} while (0)
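
Applied to the timer model, the same idea might look like the sketch below. It is only an illustration under stated assumptions: the `DIGIC_TIMER_DEBUG` switch and the `demo_read` function are hypothetical names, and `## __VA_ARGS__` relies on the GNU extension that GCC and Clang accept.

```c
#include <stdio.h>

/* Hypothetical debug switch: off by default, but the code still compiles. */
#ifndef DIGIC_TIMER_DEBUG
#define DIGIC_TIMER_DEBUG 0
#endif

/* Always type-checked by the compiler; the branch is dead code when off. */
#define DPRINTF(fmt, ...) do { \
    if (DIGIC_TIMER_DEBUG) { \
        fprintf(stderr, "%s: " fmt, __func__, ## __VA_ARGS__); \
    } \
} while (0)

/* Stand-in for a register read handler taking a 64-bit offset. */
static unsigned demo_read(unsigned long long offset)
{
    unsigned ret = (unsigned)(offset & 0xffu);

    /* Using "%x" for the 64-bit offset here would trigger -Wformat
     * even in non-debug builds, which is the point of this pattern. */
    DPRINTF("offset 0x%llx, value 0x%x\n", offset, ret);
    return ret;
}
```

Because the `fprintf` call is always seen by the compiler, a format string that no longer matches its arguments is reported during ordinary builds rather than only when someone enables the debug define.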

> +
> +# define DIGIC_TIMER_CONTROL 0x00
> +# define DIGIC_T

Re: [Qemu-devel] [PATCH v2 01/10] prep: kill get_system_io() usage

2013-09-03 Thread Paolo Bonzini

On 04/09/2013 00:29, Hervé Poussineau wrote:
> While ISA address space in prep machine is currently the one returned
> by get_system_io(), this depends on the implementation of i82378/raven
> devices, and this may not be the case forever.
> 
> Use the right ISA address space when adding some more ports to it.
> We can use any ISA device on the right ISA bus, as all ISA devices
> on the same ISA bus share the same ISA address space.
> 
> Signed-off-by: Hervé Poussineau 
> ---
>  hw/ppc/prep.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/ppc/prep.c b/hw/ppc/prep.c
> index 7e04b1a..efc892d 100644
> --- a/hw/ppc/prep.c
> +++ b/hw/ppc/prep.c
> @@ -656,7 +656,7 @@ static void ppc_prep_init(QEMUMachineInitArgs *args)
>  sysctrl->reset_irq = cpu->env.irq_inputs[PPC6xx_INPUT_HRESET];
>  
>  portio_list_init(port_list, NULL, prep_portio_list, sysctrl, "prep");
> -portio_list_add(port_list, get_system_io(), 0x0);
> +portio_list_add(port_list, isa_address_space_io(isa), 0x0);
>  
>  /* PowerPC control and status register group */
>  #if 0
> 

Should it instead be the I/O address space of the Raven
(pci_bus->address_space_io, or alternatively you could add a new
function)?  Or if you make it the ISA address space, should this be a
new, full-blown ISA device?

Paolo

Re: [Qemu-devel] [PATCH v2 05/10] raven: set a correct PCI I/O memory region

2013-09-03 Thread Paolo Bonzini

On 04/09/2013 00:29, Hervé Poussineau wrote:
> PCI I/O region is 0x3f80 bytes starting at 0x8000.
> Do not use global QEMU I/O region, which is only 64KB.

You can make the global QEMU I/O region larger, that's not a problem.

Not using address_space_io is fine as well, but it's a separate change
and I doubt it is a good idea to do it for a single target; if you do it
for all non-x86 PCI bridges, and move the initialization of
address_space_io to target-i386, that's a different story of course.

Paolo

> Signed-off-by: Hervé Poussineau 
> ---
>  hw/pci-host/prep.c |   15 +--
>  1 file changed, 9 insertions(+), 6 deletions(-)
> 
> diff --git a/hw/pci-host/prep.c b/hw/pci-host/prep.c
> index 95fa2ea..af0bf2b 100644
> --- a/hw/pci-host/prep.c
> +++ b/hw/pci-host/prep.c
> @@ -53,6 +53,7 @@ typedef struct PRePPCIState {
>  
>  qemu_irq irq[PCI_NUM_PINS];
>  PCIBus pci_bus;
> +MemoryRegion pci_io;
>  MemoryRegion pci_intack;
>  RavenPCIState pci_dev;
>  } PREPPCIState;
> @@ -136,13 +137,11 @@ static void raven_pcihost_realizefn(DeviceState *d, Error **errp)
>  
>  memory_region_init_io(&h->conf_mem, OBJECT(h), &pci_host_conf_be_ops, s,
>"pci-conf-idx", 1);
> -sysbus_add_io(dev, 0xcf8, &h->conf_mem);
> -sysbus_init_ioports(&h->busdev, 0xcf8, 1);
> +memory_region_add_subregion(&s->pci_io, 0xcf8, &h->conf_mem);
>  
>  memory_region_init_io(&h->data_mem, OBJECT(h), &pci_host_data_be_ops, s,
>"pci-conf-data", 1);
> -sysbus_add_io(dev, 0xcfc, &h->data_mem);
> -sysbus_init_ioports(&h->busdev, 0xcfc, 1);
> +memory_region_add_subregion(&s->pci_io, 0xcfc, &h->data_mem);
>  
>  memory_region_init_io(&h->mmcfg, OBJECT(s), &PPC_PCIIO_ops, s, "pciio", 0x0040);
>  memory_region_add_subregion(address_space_mem, 0x8080, &h->mmcfg);
> @@ -160,11 +159,15 @@ static void raven_pcihost_initfn(Object *obj)
>  PCIHostState *h = PCI_HOST_BRIDGE(obj);
>  PREPPCIState *s = RAVEN_PCI_HOST_BRIDGE(obj);
>  MemoryRegion *address_space_mem = get_system_memory();
> -MemoryRegion *address_space_io = get_system_io();
>  DeviceState *pci_dev;
>  
> +memory_region_init(&s->pci_io, obj, "pci-io", 0x3f80);
> +
> +/* CPU address space */
> +memory_region_add_subregion(address_space_mem, 0x8000, &s->pci_io);
>  pci_bus_new_inplace(&s->pci_bus, DEVICE(obj), NULL,
> -address_space_mem, address_space_io, 0, TYPE_PCI_BUS);
> +address_space_mem, &s->pci_io, 0, TYPE_PCI_BUS);
> +
>  h->bus = &s->pci_bus;
>  
>  object_initialize(&s->pci_dev, TYPE_RAVEN_PCI_DEVICE);
>
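
The patch above places a dedicated PCI I/O container inside the CPU address space instead of reusing the global 64KB I/O region. The address arithmetic this implies can be sketched as follows. This is a standalone illustration, not QEMU's memory API: `IoWindow` and `window_translate` are hypothetical names, and the window base and size are passed in as parameters rather than taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct IoWindow {
    uint64_t base;  /* CPU physical address where the window starts */
    uint64_t size;  /* window length in bytes */
} IoWindow;

/* Translate a CPU address into an offset inside the window.
 * Returns false when the address lies outside the window. */
static bool window_translate(const IoWindow *w, uint64_t cpu_addr,
                             uint64_t *offset)
{
    if (cpu_addr < w->base || cpu_addr - w->base >= w->size) {
        return false;
    }
    *offset = cpu_addr - w->base;
    return true;
}
```

For example, with an assumed window base of 0x80000000, a CPU access to base + 0xcf8 resolves to offset 0xcf8, which is where the patch adds the "pci-conf-idx" subregion.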

Re: [Qemu-devel] [PATCH v4 0/3] bugs fix for hpet

2013-09-03 Thread liu ping fan

On Tue, Sep 3, 2013 at 7:17 PM, Paolo Bonzini  wrote:
> On 02/09/2013 09:06, Liu Ping Fan wrote:
>> note: I rebase it onto Stefan's net-next tree, since pc-1.7 has already been 
>> defined there.
>>
>> v4:
>>   use standard compat property to set hpet's interrupt compatibility
>>
>> v3:
> >>   change hpet interrupt capability on board's demand
>>
>>
>> Liu Ping Fan (3):
>>   hpet: inverse polarity when pin above ISA_NUM_IRQS
>>   hpet: entitle more irq pins for hpet
>>   pc-1.6: add compatibility for hpet intcap on pc-*-1.6
>>
>>  hw/timer/hpet.c  | 27 +++
>>  include/hw/i386/pc.h |  5 +
>>  2 files changed, 28 insertions(+), 4 deletions(-)
>>
>
> Looks good.  But I have one question; should this be changed for PIIX
> too, or should the 1.7 PIIX machine keep the old behavior?  (I have no
> idea).
>
Your suspicion is right.  Going through the PIIX4 spec, I found that
the chipset has no integrated IOAPIC, so the compatibility handling for
pc-piix-* and pc-q35-* diverges.  Can I code the hpet compatibility
into pc-piix-1.7 to resolve this?

Regards,
Pingfan
> Paolo

[Qemu-devel] [RFC v3 5/5] hw/arm/digic: add NOR ROM support

2013-09-03 Thread Antony Pavlov

Signed-off-by: Antony Pavlov 
---
 hw/arm/digic_boards.c | 74 +++
 1 file changed, 74 insertions(+)

diff --git a/hw/arm/digic_boards.c b/hw/arm/digic_boards.c
index 0b99227..b5a9e1a 100644
--- a/hw/arm/digic_boards.c
+++ b/hw/arm/digic_boards.c
@@ -1,6 +1,13 @@
 #include "hw/boards.h"
 #include "exec/address-spaces.h"
 #include "hw/arm/digic.h"
+#include "hw/block/flash.h"
+#include "hw/loader.h"
+#include "sysemu/sysemu.h"
+
+#define DIGIC4_ROM0_BASE  0xf000
+#define DIGIC4_ROM1_BASE  0xf800
+# define DIGIC4_ROM_MAX_SIZE  0x0800
 
 typedef struct DigicBoardState {
 DigicState *digic;
@@ -9,6 +16,10 @@ typedef struct DigicBoardState {
 
 typedef struct DigicBoard {
 hwaddr ram_size;
+void (*add_rom0)(DigicBoardState *, hwaddr, const char *);
+const char *rom0_def_filename;
+void (*add_rom1)(DigicBoardState *, hwaddr, const char *);
+const char *rom1_def_filename;
 hwaddr start_addr;
 } DigicBoard;
 
@@ -35,11 +46,74 @@ static void digic4_board_init(DigicBoard *board)
 
 digic4_board_setup_ram(s, board->ram_size);
 
+if (board->add_rom0) {
+board->add_rom0(s, DIGIC4_ROM0_BASE, board->rom0_def_filename);
+}
+
+if (board->add_rom1) {
+board->add_rom1(s, DIGIC4_ROM1_BASE, board->rom1_def_filename);
+}
+
 s->digic->cpu.env.regs[15] = board->start_addr;
 }
 
+static void digic_load_rom(DigicBoardState *s, hwaddr addr,
+hwaddr max_size, const char *def_filename)
+{
+
+target_long rom_size;
+const char *filename;
+
+if (bios_name) {
+filename = bios_name;
+} else {
+filename = def_filename;
+}
+
+if (filename) {
+char *fn = qemu_find_file(QEMU_FILE_TYPE_BIOS, filename);
+
+if (!fn) {
+fprintf(stderr, "Couldn't find rom image '%s'.\n", filename);
+exit(1);
+}
+
+rom_size = load_image_targphys(fn, addr, max_size);
+if (rom_size < 0 || rom_size > max_size) {
+fprintf(stderr, "Couldn't load rom image '%s'\n", filename);
+exit(1);
+}
+}
+}
+
+static void digic4_add_k8p3215uqb_rom(DigicBoardState *s, hwaddr addr,
+const char *def_filename)
+{
+#define FLASH_K8P3215UQB_SIZE (4 * 1024 * 1024)
+#define FLASH_K8P3215UQB_SECTOR_SIZE (64 * 1024)
+
+/*
+ * Samsung K8P3215UQB:
+ *  * AMD command set;
+ *  * multiple sector size: some sectors are 8K the other ones are 64K.
+ * Alas! The pflash_cfi02_register() function creates a flash
+ * device with unified sector size.
+ */
+pflash_cfi02_register(addr, NULL, "pflash", FLASH_K8P3215UQB_SIZE,
+NULL, FLASH_K8P3215UQB_SECTOR_SIZE,
+FLASH_K8P3215UQB_SIZE / FLASH_K8P3215UQB_SECTOR_SIZE,
+DIGIC4_ROM_MAX_SIZE / FLASH_K8P3215UQB_SIZE,
+4,
+0x00EC, 0x007E, 0x0003, 0x0001,
+0x0555, 0x2aa, 0);
+
+digic_load_rom(s, addr, FLASH_K8P3215UQB_SIZE, def_filename);
+}
+
 static DigicBoard digic4_board_canon_a1100 = {
 .ram_size = 64 * 1024 * 1024,
+.add_rom1 = digic4_add_k8p3215uqb_rom,
+.rom1_def_filename = "canon-a1100-rom1.bin",
 /* CHDK recommends this address for ROM disassembly */
 .start_addr = 0xffc0,
 };
-- 
1.8.4.rc3
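
The flash registration in this patch derives two counts from three sizes: the number of sectors (flash size divided by sector size) and the number of times the chip repeats inside the ROM window (window size divided by flash size). A minimal sketch of that arithmetic, assuming a uniform sector size as the comment in the patch describes; `FlashGeometry` and `flash_geometry` are hypothetical names for illustration only.

```c
#include <stdint.h>

/* Hypothetical description of a uniform-sector NOR flash mapping. */
typedef struct FlashGeometry {
    uint32_t nb_sectors;   /* number of equally sized sectors */
    uint32_t nb_mappings;  /* how many times the chip repeats in the window */
} FlashGeometry;

static FlashGeometry flash_geometry(uint64_t flash_size, uint64_t sector_size,
                                    uint64_t window_size)
{
    FlashGeometry g;

    g.nb_sectors = (uint32_t)(flash_size / sector_size);
    g.nb_mappings = (uint32_t)(window_size / flash_size);
    return g;
}
```

For the 4 MiB part with 64 KiB sectors this gives 64 sectors; the number of mappings depends on the ROM window size, so any concrete window value plugged in here is an assumption rather than a figure taken from the patch.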

[Qemu-devel] [RFC v3 4/5] hw/arm/digic: add UART support

2013-09-03 Thread Antony Pavlov

Signed-off-by: Antony Pavlov 
---
 hw/arm/digic.c |  14 
 hw/char/Makefile.objs  |   1 +
 hw/char/digic-uart.c   | 197 +
 hw/char/digic-uart.h   |  27 +++
 include/hw/arm/digic.h |   4 +
 5 files changed, 243 insertions(+)
 create mode 100644 hw/char/digic-uart.c
 create mode 100644 hw/char/digic-uart.h

diff --git a/hw/arm/digic.c b/hw/arm/digic.c
index a71364b..89ab61c 100644
--- a/hw/arm/digic.c
+++ b/hw/arm/digic.c
@@ -45,6 +45,11 @@ static void digic_init(Object *obj)
 snprintf(name, 9, "timer[%d]", i);
 object_property_add_child(obj, name, OBJECT(&s->timer[i]), NULL);
 }
+
+object_initialize(&s->uart, sizeof(s->uart), TYPE_DIGIC_UART);
+dev = DEVICE(&s->uart);
+qdev_set_parent_bus(dev, sysbus_get_default());
+object_property_add_child(obj, "uart", OBJECT(&s->uart), NULL);
 }
 
 static void digic_realize(DeviceState *dev, Error **errp)
@@ -70,6 +75,15 @@ static void digic_realize(DeviceState *dev, Error **errp)
 sbd = SYS_BUS_DEVICE(&s->timer[i]);
 sysbus_mmio_map(sbd, 0, DIGIC4_TIMER_BASE(i));
 }
+
+object_property_set_bool(OBJECT(&s->uart), true, "realized", &err);
+if (err != NULL) {
+error_propagate(errp, err);
+return;
+}
+
+sbd = SYS_BUS_DEVICE(&s->uart);
+sysbus_mmio_map(sbd, 0, DIGIC_UART_BASE);
 }
 
 static void digic_class_init(ObjectClass *oc, void *data)
diff --git a/hw/char/Makefile.objs b/hw/char/Makefile.objs
index f8f3dbc..00d37ac 100644
--- a/hw/char/Makefile.objs
+++ b/hw/char/Makefile.objs
@@ -14,6 +14,7 @@ obj-$(CONFIG_COLDFIRE) += mcf_uart.o
 obj-$(CONFIG_OMAP) += omap_uart.o
 obj-$(CONFIG_SH4) += sh_serial.o
 obj-$(CONFIG_PSERIES) += spapr_vty.o
+obj-$(CONFIG_DIGIC) += digic-uart.o
 
 common-obj-$(CONFIG_ETRAXFS) += etraxfs_ser.o
 common-obj-$(CONFIG_ISA_DEBUG) += debugcon.o
diff --git a/hw/char/digic-uart.c b/hw/char/digic-uart.c
new file mode 100644
index 000..bb581f0
--- /dev/null
+++ b/hw/char/digic-uart.c
@@ -0,0 +1,197 @@
+/*
+ * QEMU model of the Canon Digic UART block.
+ *
+ * Copyright (C) 2013 Antony Pavlov 
+ *
+ * This model is based on reverse engineering efforts
+ * made by CHDK (http://chdk.wikia.com) and
+ * Magic Lantern (http://www.magiclantern.fm) projects
+ * contributors.
+ *
+ * See "Serial terminal" docs here:
+ *   http://magiclantern.wikia.com/wiki/Register_Map#Misc_Registers
+ *
+ * The QEMU model of the Milkymist UART block by Michael Walle
+ * is used as a template.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see .
+ *
+ */
+
+#include "hw/hw.h"
+#include "hw/sysbus.h"
+#include "sysemu/char.h"
+
+#include "hw/char/digic-uart.h"
+
+enum {
+ST_RX_RDY = (1 << 0),
+ST_TX_RDY = (1 << 1),
+};
+
+static uint64_t digic_uart_read(void *opaque, hwaddr addr,
+  unsigned size)
+{
+DigicUartState *s = opaque;
+
+addr >>= 2;
+
+switch (addr) {
+case R_RX:
+s->regs[R_ST] &= ~(ST_RX_RDY);
+break;
+
+case R_ST:
+break;
+
+default:
+qemu_log_mask(LOG_UNIMP,
+"digic_uart: read access to unknown register 0x"
+TARGET_FMT_plx, addr << 2);
+}
+
+return s->regs[addr];
+}
+
+static void digic_uart_write(void *opaque, hwaddr addr, uint64_t value,
+   unsigned size)
+{
+DigicUartState *s = opaque;
+unsigned char ch = value;
+
+addr >>= 2;
+
+switch (addr) {
+case R_TX:
+if (s->chr) {
+qemu_chr_fe_write_all(s->chr, &ch, 1);
+}
+break;
+
+case R_ST:
+/*
+ * Ignore write to R_ST.
+ *
+ * The point is that this register is actively used
+ * during receiving and transmitting symbols,
+ * but we don't know the function of most of bits.
+ *
+ * Ignoring writes to R_ST is only a simplification
+ * of the model. It has no perceptible side effects
+ * for existing guests.
+ */
+break;
+
+default:
+qemu_log_mask(LOG_UNIMP,
+"digic_uart: write access to unknown register 0x"
+TARGET_FMT_plx, addr << 2);
+}
+}
+
+static const MemoryRegionOps uart_mmio_ops = {
+.read = digic_uart_read,
+.write = digic_uart_write,
+.valid = {
+.min_access_size = 4,
+.max_acces
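
The read and write handlers above turn a byte offset into a 32-bit register index with `addr >>= 2`. A standalone sketch of that decode step with an explicit bounds check is shown below. The register layout here is hypothetical (the real indices are defined in digic-uart.h, which is not reproduced in this message), and `reg_index` and `uart_read` are illustrative names, not functions from the patch.

```c
#include <stdint.h>

/* Hypothetical register layout; the real indices live in digic-uart.h. */
enum { R_TX = 0, R_RX, R_ST, R_MAX };

/* Map a byte offset to a 32-bit register index, or -1 if out of range. */
static int reg_index(uint64_t addr)
{
    uint64_t idx = addr >> 2;

    return idx < R_MAX ? (int)idx : -1;
}

static uint32_t uart_read(const uint32_t regs[R_MAX], uint64_t addr)
{
    int idx = reg_index(addr);

    /* Unknown registers read as zero instead of indexing past the array. */
    return idx < 0 ? 0 : regs[idx];
}
```

The check only matters if the MMIO region could ever be larger than the register array; when the region size matches the array exactly, the memory core already rejects out-of-range offsets.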

[Qemu-devel] [RFC v3 3/5] hw/arm/digic: add timer support

2013-09-03 Thread Antony Pavlov

Signed-off-by: Antony Pavlov 
---
 hw/arm/digic.c |  25 ++
 hw/timer/Makefile.objs |   1 +
 hw/timer/digic-timer.c | 122 +
 hw/timer/digic-timer.h |  19 
 include/hw/arm/digic.h |   7 +++
 5 files changed, 174 insertions(+)
 create mode 100644 hw/timer/digic-timer.c
 create mode 100644 hw/timer/digic-timer.h

diff --git a/hw/arm/digic.c b/hw/arm/digic.c
index 95a9fcd..a71364b 100644
--- a/hw/arm/digic.c
+++ b/hw/arm/digic.c
@@ -30,21 +30,46 @@
 static void digic_init(Object *obj)
 {
 DigicState *s = DIGIC(obj);
+DeviceState *dev;
+int i;
 
 object_initialize(&s->cpu, sizeof(s->cpu), "arm946-" TYPE_ARM_CPU);
 object_property_add_child(obj, "cpu", OBJECT(&s->cpu), NULL);
+
+for (i = 0; i < DIGIC4_NB_TIMERS; i++) {
+char name[9];
+
+object_initialize(&s->timer[i], sizeof(s->timer[i]), TYPE_DIGIC_TIMER);
+dev = DEVICE(&s->timer[i]);
+qdev_set_parent_bus(dev, sysbus_get_default());
+snprintf(name, 9, "timer[%d]", i);
+object_property_add_child(obj, name, OBJECT(&s->timer[i]), NULL);
+}
 }
 
 static void digic_realize(DeviceState *dev, Error **errp)
 {
 DigicState *s = DIGIC(dev);
 Error *err = NULL;
+SysBusDevice *sbd;
+int i;
 
 object_property_set_bool(OBJECT(&s->cpu), true, "realized", &err);
 if (err != NULL) {
 error_propagate(errp, err);
 return;
 }
+
+for (i = 0; i < DIGIC4_NB_TIMERS; i++) {
+object_property_set_bool(OBJECT(&s->timer[i]), true, "realized", &err);
+if (err != NULL) {
+error_propagate(errp, err);
+return;
+}
+
+sbd = SYS_BUS_DEVICE(&s->timer[i]);
+sysbus_mmio_map(sbd, 0, DIGIC4_TIMER_BASE(i));
+}
 }
 
 static void digic_class_init(ObjectClass *oc, void *data)
diff --git a/hw/timer/Makefile.objs b/hw/timer/Makefile.objs
index eca5905..5479aee 100644
--- a/hw/timer/Makefile.objs
+++ b/hw/timer/Makefile.objs
@@ -25,5 +25,6 @@ obj-$(CONFIG_OMAP) += omap_synctimer.o
 obj-$(CONFIG_PXA2XX) += pxa2xx_timer.o
 obj-$(CONFIG_SH4) += sh_timer.o
 obj-$(CONFIG_TUSB6010) += tusb6010.o
+obj-$(CONFIG_DIGIC) += digic-timer.o
 
 obj-$(CONFIG_MC146818RTC) += mc146818rtc.o
diff --git a/hw/timer/digic-timer.c b/hw/timer/digic-timer.c
new file mode 100644
index 000..c6cf7ee
--- /dev/null
+++ b/hw/timer/digic-timer.c
@@ -0,0 +1,122 @@
+/*
+ * QEMU model of the Canon Digic timer block.
+ *
+ * Copyright (C) 2013 Antony Pavlov 
+ *
+ * This model is based on reverse engineering efforts
+ * made by CHDK (http://chdk.wikia.com) and
+ * Magic Lantern (http://www.magiclantern.fm) projects
+ * contributors.
+ *
+ * See "Timer/Clock Module" docs here:
+ *   http://magiclantern.wikia.com/wiki/Register_Map
+ *
+ * The QEMU model of the OSTimer in PKUnity SoC by Guan Xuetao
+ * is used as a template.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see .
+ *
+ */
+
+#include "hw/sysbus.h"
+#include "hw/ptimer.h"
+#include "qemu/main-loop.h"
+
+#include "hw/timer/digic-timer.h"
+
+#ifdef DEBUG_DIGIC_TIMER
+#define DPRINTF(fmt, ...) printf("%s: " fmt , __func__, ## __VA_ARGS__)
+#else
+#define DPRINTF(fmt, ...) do {} while (0)
+#endif
+
+# define DIGIC_TIMER_CONTROL 0x00
+# define DIGIC_TIMER_VALUE 0x0c
+
+static uint64_t digic_timer_read(void *opaque, hwaddr offset,
+unsigned size)
+{
+DigicTimerState *s = opaque;
+uint32_t ret = 0;
+
+switch (offset) {
+case DIGIC_TIMER_VALUE:
+ret = (uint32_t)ptimer_get_count(s->ptimer);
+ret = ret & 0x;
+break;
+default:
+DPRINTF("Bad offset %x\n", (int)offset);
+}
+
+DPRINTF("offset 0x%x, value 0x%x\n", offset, ret);
+return ret;
+}
+
+static void digic_timer_write(void *opaque, hwaddr offset,
+uint64_t value, unsigned size)
+{
+DigicTimerState *s = opaque;
+
+/* FIXME: just now we ignore timer enable bit */
+ptimer_set_limit(s->ptimer, 0x, 1);
+ptimer_run(s->ptimer, 1);
+}
+
+static const MemoryRegionOps digic_timer_ops = {
+.read = digic_timer_read,
+.write = digic_timer_write,
+.impl = {
+.min_access_size = 4,
+.max_access_size = 4,
+},
+.endianness = DEVICE_NATIVE_ENDIAN,
+};
+
+static void digic_timer_tick(void *opaque)
+{
+DigicTimerState *s = opaque;
+
+

[Qemu-devel] [RFC v3 2/5] hw/arm/digic: prepare DIGIC-based boards support

2013-09-03 Thread Antony Pavlov

This patch also adds initial support for the Canon
PowerShot A1100 IS compact camera.

Signed-off-by: Antony Pavlov 
---
 hw/arm/Makefile.objs  |  2 +-
 hw/arm/digic_boards.c | 63 +++
 2 files changed, 64 insertions(+), 1 deletion(-)
 create mode 100644 hw/arm/digic_boards.c

diff --git a/hw/arm/Makefile.objs b/hw/arm/Makefile.objs
index e140485..f6e9533 100644
--- a/hw/arm/Makefile.objs
+++ b/hw/arm/Makefile.objs
@@ -1,4 +1,4 @@
-obj-y += boot.o collie.o exynos4_boards.o gumstix.o highbank.o
+obj-y += boot.o collie.o digic_boards.o exynos4_boards.o gumstix.o highbank.o
 obj-y += integratorcp.o kzm.o mainstone.o musicpal.o nseries.o
 obj-y += omap_sx1.o palm.o realview.o spitz.o stellaris.o
 obj-y += tosa.o versatilepb.o vexpress.o xilinx_zynq.o z2.o
diff --git a/hw/arm/digic_boards.c b/hw/arm/digic_boards.c
new file mode 100644
index 000..0b99227
--- /dev/null
+++ b/hw/arm/digic_boards.c
@@ -0,0 +1,63 @@
+#include "hw/boards.h"
+#include "exec/address-spaces.h"
+#include "hw/arm/digic.h"
+
+typedef struct DigicBoardState {
+DigicState *digic;
+MemoryRegion ram;
+} DigicBoardState;
+
+typedef struct DigicBoard {
+hwaddr ram_size;
+hwaddr start_addr;
+} DigicBoard;
+
+static void digic4_board_setup_ram(DigicBoardState *s, hwaddr ram_size)
+{
+memory_region_init_ram(&s->ram, NULL, "ram", ram_size);
+memory_region_add_subregion(get_system_memory(), 0, &s->ram);
+vmstate_register_ram_global(&s->ram);
+}
+
+static void digic4_board_init(DigicBoard *board)
+{
+Error *err = NULL;
+
+DigicBoardState *s = g_new(DigicBoardState, 1);
+
+s->digic = DIGIC(object_new(TYPE_DIGIC));
+object_property_set_bool(OBJECT(s->digic), true, "realized", &err);
+if (err != NULL) {
+fprintf(stderr, "Couldn't realize DIGIC SoC: %s\n",
+error_get_pretty(err));
+exit(1);
+}
+
+digic4_board_setup_ram(s, board->ram_size);
+
+s->digic->cpu.env.regs[15] = board->start_addr;
+}
+
+static DigicBoard digic4_board_canon_a1100 = {
+.ram_size = 64 * 1024 * 1024,
+/* CHDK recommends this address for ROM disassembly */
+.start_addr = 0xffc0,
+};
+
+static void canon_a1100_init(QEMUMachineInitArgs *args)
+{
+digic4_board_init(&digic4_board_canon_a1100);
+}
+
+static QEMUMachine canon_a1100 = {
+.name = "canon-a1100",
+.desc = "Canon PowerShot A1100 IS",
+.init = &canon_a1100_init,
+};
+
+static void digic_register_machines(void)
+{
+qemu_register_machine(&canon_a1100);
+}
+
+machine_init(digic_register_machines)
-- 
1.8.4.rc3

[Qemu-devel] [RFC v3 1/5] hw/arm: add very initial support for Canon DIGIC SoC

2013-09-03 Thread Antony Pavlov

DIGIC is Canon Inc.'s name for a family of SoCs
for digital cameras and camcorders.

There is no publicly available specification for
DIGIC chips. All information about DIGIC chip
internals is based on reverse engineering efforts
made by CHDK (http://chdk.wikia.com) and
Magic Lantern (http://www.magiclantern.fm) projects
contributors.

Signed-off-by: Antony Pavlov 
---
 default-configs/arm-softmmu.mak |  1 +
 hw/arm/Makefile.objs|  2 +-
 hw/arm/digic.c  | 70 +
 include/hw/arm/digic.h  | 23 ++
 4 files changed, 95 insertions(+), 1 deletion(-)
 create mode 100644 hw/arm/digic.c
 create mode 100644 include/hw/arm/digic.h

diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
index ac0815d..0d1d783 100644
--- a/default-configs/arm-softmmu.mak
+++ b/default-configs/arm-softmmu.mak
@@ -63,6 +63,7 @@ CONFIG_FRAMEBUFFER=y
 CONFIG_XILINX_SPIPS=y
 
 CONFIG_A9SCU=y
+CONFIG_DIGIC=y
 CONFIG_MARVELL_88W8618=y
 CONFIG_OMAP=y
 CONFIG_TSC210X=y
diff --git a/hw/arm/Makefile.objs b/hw/arm/Makefile.objs
index 3671b42..e140485 100644
--- a/hw/arm/Makefile.objs
+++ b/hw/arm/Makefile.objs
@@ -3,5 +3,5 @@ obj-y += integratorcp.o kzm.o mainstone.o musicpal.o nseries.o
 obj-y += omap_sx1.o palm.o realview.o spitz.o stellaris.o
 obj-y += tosa.o versatilepb.o vexpress.o xilinx_zynq.o z2.o
 
-obj-y += armv7m.o exynos4210.o pxa2xx.o pxa2xx_gpio.o pxa2xx_pic.o
+obj-y += armv7m.o digic.o exynos4210.o pxa2xx.o pxa2xx_gpio.o pxa2xx_pic.o
 obj-y += omap1.o omap2.o strongarm.o
diff --git a/hw/arm/digic.c b/hw/arm/digic.c
new file mode 100644
index 000..95a9fcd
--- /dev/null
+++ b/hw/arm/digic.c
@@ -0,0 +1,70 @@
+/*
+ * QEMU model of the Canon DIGIC SoC.
+ *
+ * Copyright (C) 2013 Antony Pavlov 
+ *
+ * This model is based on reverse engineering efforts
+ * made by CHDK (http://chdk.wikia.com) and
+ * Magic Lantern (http://www.magiclantern.fm) projects
+ * contributors.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see .
+ *
+ */
+
+#include "hw/sysbus.h"
+#include "target-arm/cpu-qom.h"
+#include "hw/arm/digic.h"
+
+static void digic_init(Object *obj)
+{
+DigicState *s = DIGIC(obj);
+
+object_initialize(&s->cpu, sizeof(s->cpu), "arm946-" TYPE_ARM_CPU);
+object_property_add_child(obj, "cpu", OBJECT(&s->cpu), NULL);
+}
+
+static void digic_realize(DeviceState *dev, Error **errp)
+{
+DigicState *s = DIGIC(dev);
+Error *err = NULL;
+
+object_property_set_bool(OBJECT(&s->cpu), true, "realized", &err);
+if (err != NULL) {
+error_propagate(errp, err);
+return;
+}
+}
+
+static void digic_class_init(ObjectClass *oc, void *data)
+{
+DeviceClass *dc = DEVICE_CLASS(oc);
+
+dc->realize = digic_realize;
+}
+
+static const TypeInfo digic_type_info = {
+.name = TYPE_DIGIC,
+.parent = TYPE_DEVICE,
+.instance_size = sizeof(DigicState),
+.instance_init = digic_init,
+.class_init = digic_class_init,
+};
+
+static void digic_register_types(void)
+{
+type_register_static(&digic_type_info);
+}
+
+type_init(digic_register_types)
diff --git a/include/hw/arm/digic.h b/include/hw/arm/digic.h
new file mode 100644
index 000..0ef4723
--- /dev/null
+++ b/include/hw/arm/digic.h
@@ -0,0 +1,23 @@
+/*
+ * Misc DIGIC declarations
+ *
+ * Copyright (C) 2013 Antony Pavlov 
+ *
+ */
+
+#ifndef __DIGIC_H__
+#define __DIGIC_H__
+
+#include "cpu-qom.h"
+
+#define TYPE_DIGIC "digic"
+
+#define DIGIC(obj) OBJECT_CHECK(DigicState, (obj), TYPE_DIGIC)
+
+typedef struct DigicState {
+Object parent_obj;
+
+ARMCPU cpu;
+} DigicState;
+
+#endif /* __DIGIC_H__ */
-- 
1.8.4.rc3

[Qemu-devel] [RFC v3 0/5] hw/arm: add initial support for Canon DIGIC SoC

2013-09-03 Thread Antony Pavlov

[RFC v3 1/5] hw/arm: add very initial support for Canon DIGIC SoC
[RFC v3 2/5] hw/arm/digic: prepare DIGIC-based boards support
[RFC v3 3/5] hw/arm/digic: add timer support
[RFC v3 4/5] hw/arm/digic: add UART support
[RFC v3 5/5] hw/arm/digic: add NOR ROM support

 Changes since v2:

 1. rebase over latest master;
   * pass available size to object_initialize().
 2. digic-uart: qemu_log: use LOG_UNIMP instead of LOG_GUEST_ERROR;
 3. digic-boards: update rom image load code: introduce digic_load_rom().

 Changes since v1:

 0. drop the "add ARM946E-S CPU" patch;
 1. convert to QOM, split DIGIC SoC code and board code
 (thanks to Andreas Färber, Peter Maydell and Peter Crosthwaite);
 2. fix digic-uart (many thanks to Peter Crosthwaite for his comments);
 3. digic-boards: digic4_add_k8p3215uqb_rom(): update rom image load code:
 use the -bios option.


 DIGIC is Canon Inc.'s name for a family of SoCs
 for digital cameras and camcorders.

 See http://en.wikipedia.org/wiki/DIGIC for details.

 There is no publicly available specification for
 DIGIC chips. All information about DIGIC chip
 internals is based on reverse engineering efforts
 made by contributors to the CHDK (http://chdk.wikia.com) and
 Magic Lantern (http://www.magiclantern.fm) projects.

 This patch series also adds initial support for the Canon
 PowerShot A1100 IS compact camera (it is my only camera
 with a connected UART interface). As the differences between
 DIGIC-based cameras are mostly insignificant (e.g. RAM size,
 ROM type and size, GPIO usage), support for other compact
 and DSLR cameras can be easily added.

 This DIGIC support patch series is inspired
 by EOS QEMU from the Magic Lantern project.
 The main differences:
  * EOS QEMU uses a home-brew all-in-one monolithic design;
  this patch series uses the conventional qemu object-centric design;
  * EOS QEMU tries to provide the simplest emulation for most
  controllers inside the SoC to run Magic Lantern firmware;
  this patch series provides more complete support
  only for the core devices needed to run the barebox bootloader.
   ** EOS QEMU does not support timer counting
   (this patch series emulates 1 MHz counting);
   ** EOS QEMU supports the DIGIC UART only for outputting
   characters to stderr (this patch series
   introduces a full-blown UART interface);
   ** EOS QEMU has incomplete ROM support
   (this patch series uses conventional qemu pflash).
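
 The 1 MHz timer counting mentioned in the list above can be pictured as
 deriving the counter value from elapsed time instead of incrementing it
 on every tick. This is only a minimal sketch under stated assumptions:
 a free-running 32-bit up-counter is assumed (the real DIGIC counter
 width and direction are not described here), and digic_ticks_from_ns(),
 DIGIC_TIMER_HZ and NS_PER_SECOND are hypothetical names, not identifiers
 from this patch series.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed timer frequency: 1 MHz, i.e. one tick per microsecond. */
#define DIGIC_TIMER_HZ 1000000ULL
#define NS_PER_SECOND  1000000000ULL

/* The integer division below is only exact if this holds. */
static_assert(NS_PER_SECOND % DIGIC_TIMER_HZ == 0,
              "tick period must be a whole number of nanoseconds");

/*
 * Convert the nanoseconds elapsed since the timer was started into a
 * counter value. The result wraps modulo 2^32 (assumed counter width).
 */
static uint32_t digic_ticks_from_ns(uint64_t elapsed_ns)
{
    uint64_t ticks = elapsed_ns / (NS_PER_SECOND / DIGIC_TIMER_HZ);

    return (uint32_t)ticks;
}
```

 Computing the value on demand from a clock reading means a guest read of
 the counter register needs no periodic host timer.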

 This initial DIGIC support can't be used to run
 the original camera firmware, but it can successfully
 run an experimental version of the barebox bootloader
 (see http://www.barebox.org).

 The latest barebox sources for the PowerShot A1100 can be
 obtained here:
   https://github.com/frantony/barebox/tree/next.digic.20130829

 The precompiled ROM image usable with qemu can be
 obtained here:
   https://github.com/frantony/barebox/blob/next.digic.20130829/canon-a1100-rom1.bin

 This ROM image (after "dancing bit" encoding) can be run on
 a real Canon A1100 camera.

 Short build instructions for the __previous__ DIGIC barebox
 version (they can be used with more recent sources too) can
 be found here:
   http://lists.infradead.org/pipermail/barebox/2013-August/016007.html

[Qemu-devel] [PATCH] migration: add version supporting macros for struct pointer

2013-09-03 Thread Alexey Kardashevskiy

This adds version supporting macros VMSTATE_STRUCT_POINTER_TEST_V
and VMSTATE_STRUCT_POINTER_V in addition to the already existing
VMSTATE_STRUCT_POINTER and VMSTATE_STRUCT_POINTER_TEST macros.

Cc: Andreas Färber 
Signed-off-by: Alexey Kardashevskiy 
---
 include/migration/vmstate.h | 17 +++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
index 1c31b5d..9d09e60 100644
--- a/include/migration/vmstate.h
+++ b/include/migration/vmstate.h
@@ -310,8 +310,18 @@ extern const VMStateInfo vmstate_info_bitmap;
 .offset   = vmstate_offset_value(_state, _field, _type), \
 }
 
-#define VMSTATE_STRUCT_POINTER_TEST(_field, _state, _test, _vmsd, _type) { \
+#define VMSTATE_STRUCT_POINTER_V(_field, _state, _version, _vmsd, _type) { \
 .name = (stringify(_field)), \
+.version_id   = (_version),\
+.vmsd = &(_vmsd),\
+.size = sizeof(_type),   \
+.flags= VMS_STRUCT|VMS_POINTER,  \
+.offset   = vmstate_offset_value(_state, _field, _type), \
+}
+
+#define VMSTATE_STRUCT_POINTER_TEST_V(_field, _state, _test, _version, _vmsd, _type) { \
+.name = (stringify(_field)), \
+.version_id   = (_version),\
 .field_exists = (_test), \
 .vmsd = &(_vmsd),\
 .size = sizeof(_type),   \
@@ -497,7 +507,10 @@ extern const VMStateInfo vmstate_info_bitmap;
 VMSTATE_STRUCT_TEST(_field, _state, NULL, _version, _vmsd, _type)
 
 #define VMSTATE_STRUCT_POINTER(_field, _state, _vmsd, _type)  \
-VMSTATE_STRUCT_POINTER_TEST(_field, _state, NULL, _vmsd, _type)
+VMSTATE_STRUCT_POINTER_V(_field, _state, 0, _vmsd, _type)
+
+#define VMSTATE_STRUCT_POINTER_TEST(_field, _state, _test, _vmsd, _type) \
+VMSTATE_STRUCT_POINTER_TEST_V(_field, _state, _test, 0, _vmsd, _type)
 
 #define VMSTATE_STRUCT_ARRAY(_field, _state, _num, _version, _vmsd, _type) \
 VMSTATE_STRUCT_ARRAY_TEST(_field, _state, _num, NULL, _version,   \
-- 
1.8.4.rc4
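
The effect of giving a struct-pointer field its own version_id can be modelled in a few lines. This is a simplified sketch, not the migration code itself: it assumes the rule that a field with a test callback is governed by that callback alone, while a field without one is loaded only when the incoming stream version is at least the field's version_id. SketchField, sketch_field_is_loaded() and sketch_needs_v3() are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for a field description: only the two
 * members that decide whether the field takes part in a load. */
typedef struct SketchField {
    int version_id;    /* first stream version that carries the field */
    bool (*field_exists)(void *opaque, int version_id); /* may be NULL */
} SketchField;

/* Assumed rule: a test callback, when present, decides on its own;
 * otherwise the field is loaded iff the stream is new enough. */
static bool sketch_field_is_loaded(const SketchField *f, void *opaque,
                                   int stream_version)
{
    assert(f != NULL);

    if (f->field_exists) {
        return f->field_exists(opaque, stream_version);
    }
    return f->version_id <= stream_version;
}

/* Example test callback: the field exists from stream version 3 on. */
static bool sketch_needs_v3(void *opaque, int version_id)
{
    (void)opaque;
    return version_id >= 3;
}
```

Under this model a field declared with version 0, which is what the unversioned VMSTATE_STRUCT_POINTER now expands to, is present in every stream, so existing users keep their behaviour.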

Re: [Qemu-devel] [Qemu-stable][PATCH] rdma: fix multiple VMs parallel migration

2013-09-03 Thread Frank Yang

On 2013-9-3 13:38, Lei Li wrote:
> On 09/03/2013 12:20 PM, Frank Yang wrote:
>> Yes, it depends on low-level implementation. During my earlier test,
>
> What do you mean by the 'it depends on low-level implementation'?  Do you test
> it with IB or Ethernet?
I've tested both IB (40 GigE) and Ethernet (10 GigE). IB seems better but can
still fail.
I don't have IB (10 GigE), so I'm not sure whether the failure is related to
the bandwidth.

>> using one CQ to send and receive may cause packet loss with heavy load:
>> the destination thinks it send READY message successfully but the source
>> still waits for it. This situation always happens when the destination polls
>> receive CQE first.
>>
>> So I think using only one CQ may cause packet conflict or something like 
>> that,
>> and it should be the driver bug. However, using two CQs fix the problem.
>
> If the receiver may not receive this READY message from sender under heavy 
> load caused by
> packet loss, why two CQs can avoid this?
I haven't looked deeply into the kernel implementation. But two CQs
make sure that qemu will not poll a receive CQE when it expects to poll a
send CQE, and this does avoid the parallel migration failure. I've tested IB
and Ethernet dozens of times, and every run has succeeded so far.
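
The idea of never polling a receive CQE while waiting for a send CQE can be sketched as a small selection function. This is an illustration only: the numeric work request ID values and all SKETCH_/sketch_ names are assumptions made for this example, and no verbs calls are involved.

```c
#include <assert.h>
#include <stdint.h>

/* Work request ID ranges; the numeric values are assumed for this sketch. */
enum {
    SKETCH_WRID_NONE         = 0,
    SKETCH_WRID_RDMA_WRITE   = 1,
    SKETCH_WRID_SEND_CONTROL = 2000,
    SKETCH_WRID_RECV_CONTROL = 4000,
};

static_assert(SKETCH_WRID_SEND_CONTROL < SKETCH_WRID_RECV_CONTROL,
              "send IDs must sort below receive IDs");

typedef enum { SKETCH_CQ_NONE, SKETCH_CQ_SEND, SKETCH_CQ_RECV } SketchCq;

/* Pick the completion queue that can contain the awaited completion. */
static SketchCq sketch_cq_for_wrid(uint64_t wrid_requested)
{
    if (wrid_requested == SKETCH_WRID_RDMA_WRITE ||
        wrid_requested == SKETCH_WRID_SEND_CONTROL) {
        return SKETCH_CQ_SEND;
    }
    if (wrid_requested >= SKETCH_WRID_RECV_CONTROL) {
        return SKETCH_CQ_RECV;
    }
    return SKETCH_CQ_NONE; /* matches neither range: treat as an error */
}
```

Returning an explicit "none" value for IDs that match neither range keeps a caller from acting on a poll result that was never produced.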
>>
>>
>>
>> 2013/9/2 Isaku Yamahata
>>
>> Hi. Can you elaborate why two CQs fix it? Does it depend on
>> HCA implementation?
>>
>> I'm not against two CQs for sending and receiving. In fact I'm for it
>> because I use two CQs for postcopy RDMA support.
>>
>> thanks,
>>
>> On Fri, Aug 30, 2013 at 08:39:31PM +0800, Frank Yang wrote:
>> > When several VMs migrate with RDMA at the same time, the
>> increased pressure
>> > cause packet loss probabilistically and make source and
>> destination wait for
>> > each other. There might be some of VMs blocked during the migration.
>> >
>> > Fix the bug by using two completion queues, for sending and
>> receiving
>> > respectively.
>> >
>> > From 0c4829495cdc89eea2e94b103ac42c3f6a4b32c2 Mon Sep 17
>> 00:00:00 2001
>> > From: Frank Yang
>> > Date: Fri, 30 Aug 2013 17:53:34 +0800
>> > Subject: [PATCH] rdma: fix multiple VMs parallel migration
>> >
>> > Signed-off-by: Frank Yang
>> > ---
>> >  migration-rdma.c | 57
>> 
>> >  1 file changed, 37 insertions(+), 20 deletions(-)
>> >
>> > diff --git a/migration-rdma.c b/migration-rdma.c
>> > index 3d1266f..d0eacbb 100644
>> > --- a/migration-rdma.c
>> > +++ b/migration-rdma.c
>> > @@ -362,7 +362,8 @@ typedef struct RDMAContext {
>> >  struct ibv_qp *qp;  /* queue pair */
>> >  struct ibv_comp_channel *comp_channel;  /* completion
>> channel */
>> >  struct ibv_pd *pd;  /* protection domain */
>> > -struct ibv_cq *cq;  /* completion queue */
>> > +struct ibv_cq *send_cq; /* send completion
>> queue */
>> > +struct ibv_cq *recv_cq; /* receive
>> completion queue */
>> >
>> >  /*
>> >   * If a previous write failed (perhaps because of a failed
>> > @@ -1006,9 +1007,12 @@ static int
>> qemu_rdma_alloc_pd_cq(RDMAContext *rdma)
>> >   * Completion queue can be filled by both read and write
>> work requests,
>> >   * so must reflect the sum of both possible queue sizes.
>> >   */
>> > -rdma->cq = ibv_create_cq(rdma->verbs,
>> (RDMA_SIGNALED_SEND_MAX * 3),
>> > +rdma->send_cq = ibv_create_cq(rdma->verbs,
>> (RDMA_SIGNALED_SEND_MAX * 2),
>> >  NULL, rdma->comp_channel, 0);
>> > -if (!rdma->cq) {
>> > +rdma->recv_cq = ibv_create_cq(rdma->verbs,
>> RDMA_SIGNALED_SEND_MAX, NULL,
>> > +rdma->comp_channel, 0);
>> > +
>> > +if (!rdma->send_cq || !rdma->recv_cq) {
>> >  fprintf(stderr, "failed to allocate completion queue\n");
>> >  goto err_alloc_pd_cq;
>> >  }
>> > @@ -1040,8 +1044,8 @@ static int qemu_rdma_alloc_qp(RDMAContext
>> *rdma)
>> >  attr.cap.max_recv_wr = 3;
>> >  attr.cap.max_send_sge = 1;
>> >  attr.cap.max_recv_sge = 1;
>> > -attr.send_cq = rdma->cq;
>> > -attr.recv_cq = rdma->cq;
>> > +attr.send_cq = rdma->send_cq;
>> > +attr.recv_cq = rdma->recv_cq;
>> >  attr.qp_type = IBV_QPT_RC;
>> >
>> >  ret = rdma_create_qp(rdma->cm_id, rdma->pd, &attr);
>> > @@ -1361,13 +1365,18 @@ static void
>> qemu_rdma_signal_unregister(RDMAContext
>> > *rdma, uint64_t index,
>> >   * Return the work request ID that completed.
>> >   */
>> >  static uint64_t qemu_rdma_poll

[Qemu-devel] question about qemu disk cache mode

2013-09-03 Thread xuanmao_001

Dear qemuers:

My qemu-kvm version is 1.0.1.
I would like to understand the qemu disk cache modes. I have read
qemu-options.hx, and there are two caches that I don't understand: the host
page cache and the qemu disk write cache.

Is the "host page cache" used only for reads, and the "qemu disk write cache"
only for writes?

Which cache does the data reach first: the host page cache or the qemu disk
write cache?




xuanmao_001

Re: [Qemu-devel] [Qemu-stable][PATCH] rdma: fix multiple VMs parallel migration

2013-09-03 Thread Frank Yang

On 2013-9-3 22:13, Michael R. Hines wrote:
>
> No top-posting, please.
>
> On 09/03/2013 12:20 AM, Frank Yang wrote:
>> Yes, it depends on low-level implementation. During my earlier test,
>> using one CQ to send and receive may cause packet loss with heavy load:
>> the destination thinks it send READY message successfully but the source
>> still waits for it. This situation always happens when the destination polls
>> receive CQE first.
>>
>> So I think using only one CQ may cause packet conflict or something like 
>> that,
>> and it should be the driver bug. However, using two CQs fix the problem.
>>
>>
>
> This doesn't seem like a very clear answer. Are you sure it's packet loss?
>
> The queue pairs are supposed to be reliable - I've never experienced a 
> situation
> where packets were simply "dropped" for no reason without breaking the
> connection and putting the QP into an error state.
>
> - Michael
>
The facts are:
1. The destination polls the send completion of the READY message
 successfully. Either the READY message was indeed sent successfully and the
 source did not receive it, or the destination did not send the READY message
 out at all.
2. I've tried sending the READY message again by adding some code during the
migration. The source can receive the READY message successfully. So the
connection is not broken and the QP works fine.

The packet loss I'm talking about does not only refer to loss during
transmission. The message may also not actually have been sent out
successfully. ibv_poll_cq() returns with no error, but the source doesn't
receive the message. As far as qemu is concerned, the message it sent is lost.

Re: [Qemu-devel] [PATCH V11 02/11] NUMA: split -numa option

2013-09-03 Thread Wanlong Gao

On 09/04/2013 09:49 AM, Eduardo Habkost wrote:
> On Fri, Aug 30, 2013 at 11:10:41AM +0800, Wanlong Gao wrote:
>> Change the -numa option as follows, as Paolo suggested:
>> -numa node,nodeid=0,cpus=0-1 \
>> -numa mem,nodeid=0,size=1G
>>
>> This new option will work better with the upcoming memory hotplug.
>> The new option is implemented using OptsVisitor.
>>
>> "-numa node,mem=xx" remains supported as legacy syntax.
>>
>> Reviewed-by: Laszlo Ersek 
>> Signed-off-by: Wanlong Gao 
> 
> Would it be possible to first move the existing code as-is to numa.c,
> then introduce qemu_numa_opts, and then introduce "-numa mem"? It would
> make the patch much easier to review.

I thought this patch was straightforward, but if you like I can split it as
you suggested. ;)

> 
>> ---
>>  Makefile.target |   2 +-
>>  include/sysemu/sysemu.h |   3 +
>>  numa.c  | 144 
>> 
>>  qemu-options.hx |   6 +-
>>  vl.c| 113 ++---
>>  5 files changed, 168 insertions(+), 100 deletions(-)
>>  create mode 100644 numa.c
>>
>> diff --git a/Makefile.target b/Makefile.target
>> index 9a49852..7e1fddf 100644
>> --- a/Makefile.target
>> +++ b/Makefile.target
>> @@ -113,7 +113,7 @@ endif #CONFIG_BSD_USER
>>  #
>>  # System emulator target
>>  ifdef CONFIG_SOFTMMU
>> -obj-y += arch_init.o cpus.o monitor.o gdbstub.o balloon.o ioport.o
>> +obj-y += arch_init.o cpus.o monitor.o gdbstub.o balloon.o ioport.o numa.o
>>  obj-y += qtest.o
>>  obj-y += hw/
>>  obj-$(CONFIG_FDT) += device_tree.o
>> diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
>> index b1aa059..489b4b6 100644
>> --- a/include/sysemu/sysemu.h
>> +++ b/include/sysemu/sysemu.h
>> @@ -129,8 +129,11 @@ extern QEMUClockType rtc_clock;
>>  #define MAX_NODES 64
>>  #define MAX_CPUMASK_BITS 255
>>  extern int nb_numa_nodes;
>> +extern int nb_numa_mem_nodes;
>>  extern uint64_t node_mem[MAX_NODES];
>>  extern unsigned long *node_cpumask[MAX_NODES];
>> +extern QemuOptsList qemu_numa_opts;
>> +int numa_init_func(QemuOpts *opts, void *opaque);
>>  
>>  #define MAX_OPTION_ROMS 16
>>  typedef struct QEMUOptionRom {
>> diff --git a/numa.c b/numa.c
>> new file mode 100644
>> index 000..e6924f4
>> --- /dev/null
>> +++ b/numa.c
>> @@ -0,0 +1,144 @@
>> +/*
>> + * QEMU System Emulator
>> + *
>> + * Copyright (c) 2013 Fujitsu Ltd.
>> + * Author: Wanlong Gao 
>> + *
>> + * Permission is hereby granted, free of charge, to any person obtaining a 
>> copy
>> + * of this software and associated documentation files (the "Software"), to 
>> deal
>> + * in the Software without restriction, including without limitation the 
>> rights
>> + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
>> + * copies of the Software, and to permit persons to whom the Software is
>> + * furnished to do so, subject to the following conditions:
>> + *
>> + * The above copyright notice and this permission notice shall be included 
>> in
>> + * all copies or substantial portions of the Software.
>> + *
>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS 
>> OR
>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
>> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR 
>> OTHER
>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
>> FROM,
>> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
>> + * THE SOFTWARE.
>> + */
>> +
>> +#include "sysemu/sysemu.h"
>> +#include "qemu/bitmap.h"
>> +#include "qapi-visit.h"
>> +#include "qapi/opts-visitor.h"
>> +#include "qapi/dealloc-visitor.h"
>> +
>> +QemuOptsList qemu_numa_opts = {
>> +.name = "numa",
>> +.implied_opt_name = "type",
>> +.head = QTAILQ_HEAD_INITIALIZER(qemu_numa_opts.head),
>> +.desc = { { 0 } } /* validated with OptsVisitor */
>> +};
>> +
>> +static int numa_node_parse(NumaNodeOptions *opts)
>> +{
>> +uint16_t nodenr;
>> +uint16List *cpus = NULL;
>> +
>> +if (opts->has_nodeid) {
>> +nodenr = opts->nodeid;
>> +if (nodenr >= MAX_NODES) {
>> +fprintf(stderr, "qemu: Max number of NUMA nodes reached: %"
>> +PRIu16 "\n", nodenr);
>> +return -1;
>> +}
>> +} else {
>> +nodenr = nb_numa_nodes;
>> +}
> 
> If you make the (nodenr >= MAX_NODES) check unconditional, you simplify
> the code and you won't need the nb_numa_nodes check at numa_init_func().

Yeah, thank you.

> 
>> +
>> +for (cpus = opts->cpus; cpus; cpus = cpus->next) {
>> +bitmap_set(node_cpumask[nodenr], cpus->value, 1);
>> +}
> 
> What if cpus->value > MAX_CPUMASK_BITS?

Will check it.
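
A bounds check of the kind asked for above could look roughly like this. It is a standalone sketch, not the code from the patch: it assumes the per-node mask holds exactly MAX_CPUMASK_BITS (255) bits, so valid CPU indexes are 0 to 254, and it uses a plain array of unsigned long instead of QEMU's bitmap helpers. All SKETCH_/sketch_ names are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors MAX_CPUMASK_BITS from the quoted header (assumed mask size). */
#define SKETCH_MAX_CPUMASK_BITS 255
#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_MASK_LONGS \
    ((SKETCH_MAX_CPUMASK_BITS + SKETCH_BITS_PER_LONG - 1) / SKETCH_BITS_PER_LONG)

/* Set one CPU bit, refusing indexes the mask cannot hold. */
static bool sketch_cpumask_set(unsigned long *mask, uint16_t cpu)
{
    assert(mask != NULL);

    if (cpu >= SKETCH_MAX_CPUMASK_BITS) {
        return false; /* out of range: caller should report an error */
    }
    mask[cpu / SKETCH_BITS_PER_LONG] |= 1UL << (cpu % SKETCH_BITS_PER_LONG);
    return true;
}
```

With a check like this, an option such as cpus=300 is rejected before any write lands outside the mask.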

> 
>> +
>> +if (opts->has_mem) {
>> +int64_t mem_size;
>> +char *endptr;
>>

Re: [Qemu-devel] [PATCH v4 00/12] xics: reworks and in-kernel support

2013-09-03 Thread Alexey Kardashevskiy

On 08/30/2013 03:28 PM, Alexey Kardashevskiy wrote:
> Yet another try with XICS and XICS-KVM.
> 
> v3->v4:
> Addressed multiple comments from Alex;
> Split out many tiny patches to make them easier to review;
> Fixed xics_cpu_setup not to call the parent;
> And many, many small changes.
> 
> v2->v3:
> Addressed multiple comments from Andreas;
> Added 2 patches for XICS from Ben - I included them into the series as they
> are about XICS and they won't rebase automatically if moved before XICS rework
> so it seemed to me that it would be better to carry them together. If it is
> wrong, please let me know, I'll repost them separately.
> 
> v1->v2:
> The main change is this adds "xics-common" parent for emulated XICS and 
> XICS-KVM.
> And many, many small changes, mostly to address Andreas comments.
> 
> Migration from XICS to XICS-KVM and vice versa still works.
> 
> 
> Alexey Kardashevskiy (8):
>   xics: move reset and cpu_setup
>   spapr: move cpu_setup after kvmppc_set_papr
>   xics: replace fprintf with error_report
>   xics: add pre_save/post_load dispatchers
>   xics: convert init() to realize()
>   xics: add missing const specifiers to TypeInfo
>   xics: split to xics and xics-common
>   xics: add cpu_setup callback
> 
> Benjamin Herrenschmidt (2):
>   xics: Implement H_IPOLL
>   xics: Implement H_XIRR_X
> 
> David Gibson (2):
>   target-ppc: Add helper for KVM_PPC_RTAS_DEFINE_TOKEN
>   xics-kvm: Support for in-kernel XICS interrupt controller
> 
>  default-configs/ppc64-softmmu.mak |   1 +
>  hw/intc/Makefile.objs |   1 +
>  hw/intc/xics.c| 331 +-
>  hw/intc/xics_kvm.c| 488 
> ++
>  hw/ppc/spapr.c|  27 ++-
>  include/hw/ppc/spapr.h|   1 +
>  include/hw/ppc/xics.h |  57 +
>  target-ppc/kvm.c  |  14 ++
>  target-ppc/kvm_ppc.h  |   7 +
>  9 files changed, 865 insertions(+), 62 deletions(-)
>  create mode 100644 hw/intc/xics_kvm.c


Alex, ping?


-- 
Alexey

Re: [Qemu-devel] [Qemu-stable][PATCH] rdma: fix multiple VMs parallel migration

2013-09-03 Thread Frank Yang

On 2013-9-3 13:03, Lei Li wrote:
> Hi Frank,
>
> I failed to apply this patch. Please make sure to use git-send-email, 
> otherwise
> it's a little hard to review. :)
>
> On 08/30/2013 08:39 PM, Frank Yang wrote:
>> When several VMs migrate with RDMA at the same time, the increased pressure 
>> cause packet loss probabilistically and make source and destination wait for 
>> each other. There might be some of VMs blocked during the migration.
>>
>> Fix the bug by using two completion queues, for sending and receiving 
>> respectively.
>
>>
>> From 0c4829495cdc89eea2e94b103ac42c3f6a4b32c2 Mon Sep 17 00:00:00 2001
>> From: Frank Yang
>> Date: Fri, 30 Aug 2013 17:53:34 +0800
>> Subject: [PATCH] rdma: fix multiple VMs parallel migration
>
> The commit message should be here within the patch. You can use 'git commit 
> --amend'
> to add it.
>  
>
>>
>> Signed-off-by: Frank Yang
>> ---
>>  migration-rdma.c | 57 
>> 
>>  1 file changed, 37 insertions(+), 20 deletions(-)
>>
>> diff --git a/migration-rdma.c b/migration-rdma.c
>> index 3d1266f..d0eacbb 100644
>> --- a/migration-rdma.c
>> +++ b/migration-rdma.c
>> @@ -362,7 +362,8 @@ typedef struct RDMAContext {
>>  struct ibv_qp *qp;  /* queue pair */
>>  struct ibv_comp_channel *comp_channel;  /* completion channel */
>>  struct ibv_pd *pd;  /* protection domain */
>> -struct ibv_cq *cq;  /* completion queue */
>> +struct ibv_cq *send_cq; /* send completion queue */
>> +struct ibv_cq *recv_cq; /* receive completion queue */
>>  /*
>>   * If a previous write failed (perhaps because of a failed
>> @@ -1006,9 +1007,12 @@ static int qemu_rdma_alloc_pd_cq(RDMAContext *rdma)
>>   * Completion queue can be filled by both read and write work requests,
>>   * so must reflect the sum of both possible queue sizes.
>>   */
>> -rdma->cq = ibv_create_cq(rdma->verbs, (RDMA_SIGNALED_SEND_MAX * 3),
>> +rdma->send_cq = ibv_create_cq(rdma->verbs, (RDMA_SIGNALED_SEND_MAX * 2),
>>  NULL, rdma->comp_channel, 0);
>> -if (!rdma->cq) {
>> +rdma->recv_cq = ibv_create_cq(rdma->verbs, RDMA_SIGNALED_SEND_MAX, NULL,
>> +rdma->comp_channel, 0);
>> +
>> +if (!rdma->send_cq || !rdma->recv_cq) {
>>  fprintf(stderr, "failed to allocate completion queue\n");
>>  goto err_alloc_pd_cq;
>>  }
>> @@ -1040,8 +1044,8 @@ static int qemu_rdma_alloc_qp(RDMAContext *rdma)
>>  attr.cap.max_recv_wr = 3;
>>  attr.cap.max_send_sge = 1;
>>  attr.cap.max_recv_sge = 1;
>> -attr.send_cq = rdma->cq;
>> -attr.recv_cq = rdma->cq;
>> +attr.send_cq = rdma->send_cq;
>> +attr.recv_cq = rdma->recv_cq;
>>  attr.qp_type = IBV_QPT_RC;
>>  ret = rdma_create_qp(rdma->cm_id, rdma->pd, &attr);
>> @@ -1361,13 +1365,18 @@ static void qemu_rdma_signal_unregister(RDMAContext 
>> *rdma, uint64_t index,
>>   * Return the work request ID that completed.
>>   */
>>  static uint64_t qemu_rdma_poll(RDMAContext *rdma, uint64_t *wr_id_out,
>> -   uint32_t *byte_len)
>> +   uint32_t *byte_len, int wrid_requested)
>>  {
>>  int ret;
>>  struct ibv_wc wc;
>>  uint64_t wr_id;
>> -ret = ibv_poll_cq(rdma->cq, 1, &wc);
>> +if (wrid_requested == RDMA_WRID_RDMA_WRITE ||
>> +wrid_requested == RDMA_WRID_SEND_CONTROL) {
>> +ret = ibv_poll_cq(rdma->send_cq, 1, &wc);
>> +} else if (wrid_requested >= RDMA_WRID_RECV_CONTROL) {
>> +ret = ibv_poll_cq(rdma->recv_cq, 1, &wc);
>> +}
>>  if (!ret) {
>>  *wr_id_out = RDMA_WRID_NONE;
>> @@ -1460,12 +1469,9 @@ static int qemu_rdma_block_for_wrid(RDMAContext 
>> *rdma, int wrid_requested,
>>  void *cq_ctx;
>>  uint64_t wr_id = RDMA_WRID_NONE, wr_id_in;
>> -if (ibv_req_notify_cq(rdma->cq, 0)) {
>> -return -1;
>> -}
>>  /* poll cq first */
>>  while (wr_id != wrid_requested) {
>> -ret = qemu_rdma_poll(rdma, &wr_id_in, byte_len);
>> +ret = qemu_rdma_poll(rdma, &wr_id_in, byte_len, wrid_requested);
>>  if (ret < 0) {
>>  return ret;
>>  }
>> @@ -1487,6 +1493,17 @@ static int qemu_rdma_block_for_wrid(RDMAContext 
>> *rdma, int wrid_requested,
>>  }
>>  while (1) {
>> +if (wrid_requested == RDMA_WRID_RDMA_WRITE ||
>> +wrid_requested == RDMA_WRID_SEND_CONTROL) {
>> +if (ibv_req_notify_cq(rdma->send_cq, 0)) {
>> +return -1;
>> +}
>> +} else if (wrid_requested >= RDMA_WRID_RECV_CONTROL) {
>> +if (ibv_req_notify_cq(rdma->recv_cq, 0)) {
>> +return -1;
>> +}
>> +}
>> +
>>  /*
>>   * Coroutine doesn't start until process_incoming_migration()
>>

Re: [Qemu-devel] [PATCH V11 06/11] NUMA: parse guest numa nodes memory policy

2013-09-03 Thread Wanlong Gao

On 09/04/2013 10:28 AM, Eduardo Habkost wrote:
> On Fri, Aug 30, 2013 at 11:10:45AM +0800, Wanlong Gao wrote:
>> The memory policy setting format is like:
>> 
>> policy={default|membind|interleave|preferred}[,relative=true],host-nodes=N-N
>> And we are adding this setting as a suboption of "-numa mem,",
>> the memory policy then can be set like following:
>> -numa node,nodeid=0,cpus=0 \
>> -numa node,nodeid=1,cpus=1 \
>> -numa mem,nodeid=0,size=1G,policy=membind,host-nodes=0-1 \
>> -numa mem,nodeid=1,size=1G,policy=interleave,relative=true,host-nodes=1
>>
>> Signed-off-by: Wanlong Gao 
>> ---
>>  include/sysemu/sysemu.h |  5 -
>>  numa.c  | 13 +
>>  qapi-schema.json| 33 +++--
>>  vl.c|  3 +++
>>  4 files changed, 51 insertions(+), 3 deletions(-)
>>
>> diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
>> index e1e4320..2d04bad 100644
>> --- a/include/sysemu/sysemu.h
>> +++ b/include/sysemu/sysemu.h
>> @@ -127,13 +127,16 @@ extern size_t boot_splash_filedata_size;
>>  extern uint8_t qemu_extra_params_fw[2];
>>  extern QEMUClockType rtc_clock;
>>  
>> -#define MAX_NODES 64
>> +#define MAX_NODES 128
> 
> Can you please include this in a separate patch?

OK, thank you.

Regards,
Wanlong Gao

>

[Qemu-devel] [PATCH 2/2] rdma: simplify qemu_rdma_register_and_get_keys()

2013-09-03 Thread Isaku Yamahata

Signed-off-by: Isaku Yamahata 
---
 migration-rdma.c |   23 +++
 1 file changed, 7 insertions(+), 16 deletions(-)

diff --git a/migration-rdma.c b/migration-rdma.c
index db5a908..941c07e 100644
--- a/migration-rdma.c
+++ b/migration-rdma.c
@@ -1128,8 +1128,7 @@ static int qemu_rdma_search_ram_block(RDMAContext *rdma,
  */
 static int qemu_rdma_register_and_get_keys(RDMAContext *rdma,
 RDMALocalBlock *block, uint8_t *host_addr,
-uint32_t *lkey, uint32_t *rkey, int chunk,
-uint8_t *chunk_start, uint8_t *chunk_end)
+uint32_t *lkey, uint32_t *rkey, int chunk)
 {
 if (block->mr) {
 if (lkey) {
@@ -1155,6 +1154,8 @@ static int qemu_rdma_register_and_get_keys(RDMAContext 
*rdma,
  * If 'lkey', then we're the source VM, so grant access only to ourselves.
  */
 if (!block->pmr[chunk]) {
+uint8_t *chunk_start = ram_chunk_start(block, chunk);
+uint8_t *chunk_end = ram_chunk_end(block, chunk);
 uint64_t len = chunk_end - chunk_start;
 
 DDPRINTF("Registering %" PRIu64 " bytes @ %p\n",
@@ -1849,7 +1850,6 @@ static int qemu_rdma_write_one(QEMUFile *f, RDMAContext 
*rdma,
 struct ibv_send_wr *bad_wr;
 int reg_result_idx, ret, count = 0;
 uint64_t chunk, chunks;
-uint8_t *chunk_start, *chunk_end;
 RDMALocalBlock *block = &(rdma->local_ram_blocks.block[current_index]);
 RDMARegister reg;
 RDMARegisterResult *reg_result;
@@ -1865,7 +1865,6 @@ retry:
 sge.length = length;
 
 chunk = ram_chunk_index(block->local_host_addr, (uint8_t *) sge.addr);
-chunk_start = ram_chunk_start(block, chunk);
 
 if (block->is_ram_block) {
 chunks = length / (1UL << RDMA_REG_CHUNK_SHIFT);
@@ -1884,8 +1883,6 @@ retry:
 DDPRINTF("Writing %" PRIu64 " chunks, (%" PRIu64 " MB)\n",
 chunks + 1, (chunks + 1) * (1UL << RDMA_REG_CHUNK_SHIFT) / 1024 / 
1024);
 
-chunk_end = ram_chunk_end(block, chunk + chunks);
-
 if (!rdma->pin_all) {
 #ifdef RDMA_UNREGISTRATION_EXAMPLE
 qemu_rdma_unregister_waiting(rdma);
@@ -1974,8 +1971,7 @@ retry:
 /* try to overlap this single registration with the one we sent. */
 if (qemu_rdma_register_and_get_keys(rdma, block,
 (uint8_t *) sge.addr,
-&sge.lkey, NULL, chunk,
-chunk_start, chunk_end)) {
+&sge.lkey, NULL, chunk)) {
 fprintf(stderr, "cannot get lkey!\n");
 return -EINVAL;
 }
@@ -1995,8 +1991,7 @@ retry:
 /* already registered before */
 if (qemu_rdma_register_and_get_keys(rdma, block,
 (uint8_t *)sge.addr,
-&sge.lkey, NULL, chunk,
-chunk_start, chunk_end)) {
+&sge.lkey, NULL, chunk)) {
 fprintf(stderr, "cannot get lkey!\n");
 return -EINVAL;
 }
@@ -2007,8 +2002,7 @@ retry:
 send_wr.wr.rdma.rkey = block->remote_rkey;
 
 if (qemu_rdma_register_and_get_keys(rdma, block, (uint8_t *)sge.addr,
- &sge.lkey, NULL, chunk,
- chunk_start, chunk_end)) {
+ &sge.lkey, NULL, chunk)) {
 fprintf(stderr, "cannot get lkey!\n");
 return -EINVAL;
 }
@@ -3054,7 +3048,6 @@ static int qemu_rdma_registration_handle(QEMUFile *f, 
void *opaque,
 
 for (count = 0; count < head.repeat; count++) {
 uint64_t chunk;
-uint8_t *chunk_start, *chunk_end;
 
 reg = &registers[count];
 network_to_register(reg);
@@ -3076,11 +3069,9 @@ static int qemu_rdma_registration_handle(QEMUFile *f, 
void *opaque,
 host_addr = block->local_host_addr +
 (reg->key.chunk * (1UL << RDMA_REG_CHUNK_SHIFT));
 }
-chunk_start = ram_chunk_start(block, chunk);
-chunk_end = ram_chunk_end(block, chunk + reg->chunks);
 if (qemu_rdma_register_and_get_keys(rdma, block,
 (uint8_t *)host_addr, NULL, &reg_result->rkey,
-chunk, chunk_start, chunk_end)) {
+chunk)) {
 fprintf(stderr, "cannot get rkey!\n");
 ret = -EINVAL;
 goto out;
-- 
1.7.10.4

[Qemu-devel] [PATCH 1/2] rdma: constify ram_chunk_{index, start, end}

2013-09-03 Thread Isaku Yamahata

Signed-off-by: Isaku Yamahata 
---
 migration-rdma.c |8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/migration-rdma.c b/migration-rdma.c
index e71c10a..db5a908 100644
--- a/migration-rdma.c
+++ b/migration-rdma.c
@@ -511,19 +511,21 @@ static int qemu_rdma_exchange_send(RDMAContext *rdma, 
RDMAControlHeader *head,
int *resp_idx,
int (*callback)(RDMAContext *rdma));
 
-static inline uint64_t ram_chunk_index(uint8_t *start, uint8_t *host)
+static inline uint64_t ram_chunk_index(const uint8_t *start,
+   const uint8_t *host)
 {
 return ((uintptr_t) host - (uintptr_t) start) >> RDMA_REG_CHUNK_SHIFT;
 }
 
-static inline uint8_t *ram_chunk_start(RDMALocalBlock *rdma_ram_block,
+static inline uint8_t *ram_chunk_start(const RDMALocalBlock *rdma_ram_block,
uint64_t i)
 {
 return (uint8_t *) (((uintptr_t) rdma_ram_block->local_host_addr)
 + (i << RDMA_REG_CHUNK_SHIFT));
 }
 
-static inline uint8_t *ram_chunk_end(RDMALocalBlock *rdma_ram_block, uint64_t 
i)
+static inline uint8_t *ram_chunk_end(const RDMALocalBlock *rdma_ram_block,
+ uint64_t i)
 {
 uint8_t *result = ram_chunk_start(rdma_ram_block, i) +
  (1UL << RDMA_REG_CHUNK_SHIFT);
-- 
1.7.10.4
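
The chunk helpers touched by this patch are plain shift arithmetic on host addresses. The standalone sketch below reproduces that arithmetic for illustration; the chunk shift of 20 (1 MB chunks) is an assumption, SketchBlock is a hypothetical stand-in for RDMALocalBlock, and the sketch does not clamp the chunk end to the end of the block.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_CHUNK_SHIFT 20 /* assumed: 1 MB registration chunks */

/* Hypothetical stand-in: only the base address of the block is needed. */
typedef struct SketchBlock {
    uint8_t *local_host_addr;
} SketchBlock;

/* Index of the chunk containing 'host', counted from 'start'. */
static inline uint64_t sketch_chunk_index(const uint8_t *start,
                                          const uint8_t *host)
{
    assert(host >= start);
    return ((uintptr_t)host - (uintptr_t)start) >> SKETCH_CHUNK_SHIFT;
}

/* First byte of chunk 'i'. */
static inline uint8_t *sketch_chunk_start(const SketchBlock *block,
                                          uint64_t i)
{
    return block->local_host_addr + ((uintptr_t)i << SKETCH_CHUNK_SHIFT);
}

/* One past the last byte of chunk 'i' (no clamping in this sketch). */
static inline uint8_t *sketch_chunk_end(const SketchBlock *block, uint64_t i)
{
    return sketch_chunk_start(block, i) + (1UL << SKETCH_CHUNK_SHIFT);
}
```

None of the helpers writes through its pointer arguments, which is why the const qualifiers can be added without touching any caller.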

Re: [Qemu-devel] [PATCH V11 06/11] NUMA: parse guest numa nodes memory policy

2013-09-03 Thread Eduardo Habkost

On Fri, Aug 30, 2013 at 11:10:45AM +0800, Wanlong Gao wrote:
> The memory policy setting format is like:
> 
> policy={default|membind|interleave|preferred}[,relative=true],host-nodes=N-N
> And we are adding this setting as a suboption of "-numa mem,",
> the memory policy then can be set like following:
> -numa node,nodeid=0,cpus=0 \
> -numa node,nodeid=1,cpus=1 \
> -numa mem,nodeid=0,size=1G,policy=membind,host-nodes=0-1 \
> -numa mem,nodeid=1,size=1G,policy=interleave,relative=true,host-nodes=1
> 
> Signed-off-by: Wanlong Gao 
> ---
>  include/sysemu/sysemu.h |  5 -
>  numa.c  | 13 +
>  qapi-schema.json| 33 +++--
>  vl.c|  3 +++
>  4 files changed, 51 insertions(+), 3 deletions(-)
> 
> diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
> index e1e4320..2d04bad 100644
> --- a/include/sysemu/sysemu.h
> +++ b/include/sysemu/sysemu.h
> @@ -127,13 +127,16 @@ extern size_t boot_splash_filedata_size;
>  extern uint8_t qemu_extra_params_fw[2];
>  extern QEMUClockType rtc_clock;
>  
> -#define MAX_NODES 64
> +#define MAX_NODES 128

Can you please include this in a separate patch?

-- 
Eduardo

Re: [Qemu-devel] [PATCH V11 02/11] NUMA: split -numa option

2013-09-03 Thread Eduardo Habkost

On Fri, Aug 30, 2013 at 11:10:41AM +0800, Wanlong Gao wrote:
> Change the -numa option as follows, as Paolo suggested:
> -numa node,nodeid=0,cpus=0-1 \
> -numa mem,nodeid=0,size=1G
> 
> This new option will work better with the upcoming memory hotplug.
> The new option is implemented using OptsVisitor.
> 
> "-numa node,mem=xx" remains supported as legacy syntax.
> 
> Reviewed-by: Laszlo Ersek 
> Signed-off-by: Wanlong Gao 

Would it be possible to first move the existing code as-is to numa.c,
then introduce qemu_numa_opts, and then introduce "-numa mem"? It would
make the patch much easier to review.

> ---
>  Makefile.target |   2 +-
>  include/sysemu/sysemu.h |   3 +
>  numa.c  | 144 
> 
>  qemu-options.hx |   6 +-
>  vl.c| 113 ++---
>  5 files changed, 168 insertions(+), 100 deletions(-)
>  create mode 100644 numa.c
> 
> diff --git a/Makefile.target b/Makefile.target
> index 9a49852..7e1fddf 100644
> --- a/Makefile.target
> +++ b/Makefile.target
> @@ -113,7 +113,7 @@ endif #CONFIG_BSD_USER
>  #
>  # System emulator target
>  ifdef CONFIG_SOFTMMU
> -obj-y += arch_init.o cpus.o monitor.o gdbstub.o balloon.o ioport.o
> +obj-y += arch_init.o cpus.o monitor.o gdbstub.o balloon.o ioport.o numa.o
>  obj-y += qtest.o
>  obj-y += hw/
>  obj-$(CONFIG_FDT) += device_tree.o
> diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
> index b1aa059..489b4b6 100644
> --- a/include/sysemu/sysemu.h
> +++ b/include/sysemu/sysemu.h
> @@ -129,8 +129,11 @@ extern QEMUClockType rtc_clock;
>  #define MAX_NODES 64
>  #define MAX_CPUMASK_BITS 255
>  extern int nb_numa_nodes;
> +extern int nb_numa_mem_nodes;
>  extern uint64_t node_mem[MAX_NODES];
>  extern unsigned long *node_cpumask[MAX_NODES];
> +extern QemuOptsList qemu_numa_opts;
> +int numa_init_func(QemuOpts *opts, void *opaque);
>  
>  #define MAX_OPTION_ROMS 16
>  typedef struct QEMUOptionRom {
> diff --git a/numa.c b/numa.c
> new file mode 100644
> index 000..e6924f4
> --- /dev/null
> +++ b/numa.c
> @@ -0,0 +1,144 @@
> +/*
> + * QEMU System Emulator
> + *
> + * Copyright (c) 2013 Fujitsu Ltd.
> + * Author: Wanlong Gao 
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a copy
> + * of this software and associated documentation files (the "Software"), to deal
> + * in the Software without restriction, including without limitation the rights
> + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
> + * copies of the Software, and to permit persons to whom the Software is
> + * furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
> + * THE SOFTWARE.
> + */
> +
> +#include "sysemu/sysemu.h"
> +#include "qemu/bitmap.h"
> +#include "qapi-visit.h"
> +#include "qapi/opts-visitor.h"
> +#include "qapi/dealloc-visitor.h"
> +
> +QemuOptsList qemu_numa_opts = {
> +.name = "numa",
> +.implied_opt_name = "type",
> +.head = QTAILQ_HEAD_INITIALIZER(qemu_numa_opts.head),
> +.desc = { { 0 } } /* validated with OptsVisitor */
> +};
> +
> +static int numa_node_parse(NumaNodeOptions *opts)
> +{
> +uint16_t nodenr;
> +uint16List *cpus = NULL;
> +
> +if (opts->has_nodeid) {
> +nodenr = opts->nodeid;
> +if (nodenr >= MAX_NODES) {
> +fprintf(stderr, "qemu: Max number of NUMA nodes reached: %"
> +PRIu16 "\n", nodenr);
> +return -1;
> +}
> +} else {
> +nodenr = nb_numa_nodes;
> +}

If you make the (nodenr >= MAX_NODES) check unconditional, you simplify
the code and you won't need the nb_numa_nodes check at numa_init_func().
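
A minimal sketch of that suggestion, not the actual numa.c code: choose the
node number first, then apply a single bounds check no matter where the value
came from. The struct is a hypothetical stand-in for NumaNodeOptions, and
MAX_NODES uses the value from the quoted sysemu.h hunk.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define MAX_NODES 64 /* value from the quoted sysemu.h hunk */

/* Hypothetical stand-in for the parsed options, not the real QAPI type. */
struct node_opts_sketch {
    bool has_nodeid;
    uint16_t nodeid;
};

/* Returns the node number to use, or -1 if it is out of range. */
static int resolve_nodenr(const struct node_opts_sketch *opts,
                          int nb_numa_nodes)
{
    /* Explicit nodeid if given, otherwise the next free node. */
    int nodenr = opts->has_nodeid ? opts->nodeid : nb_numa_nodes;

    /* One unconditional check covers both cases. */
    if (nodenr >= MAX_NODES) {
        fprintf(stderr, "qemu: Max number of NUMA nodes reached: %d\n",
                nodenr);
        return -1;
    }
    return nodenr;
}
```

With the check placed after the assignment, an implicit node number that has
already reached MAX_NODES is rejected in the same place as an explicit one.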

> +
> +for (cpus = opts->cpus; cpus; cpus = cpus->next) {
> +bitmap_set(node_cpumask[nodenr], cpus->value, 1);
> +}

What if cpus->value > MAX_CPUMASK_BITS?
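
One way to guard against that, sketched under the assumption that each
node_cpumask bitmap holds exactly MAX_CPUMASK_BITS bits (so valid indices run
from 0 to MAX_CPUMASK_BITS - 1). The helper name and the plain array bitmap
are hypothetical; the real code would keep using bitmap_set().

```c
#include <limits.h>
#include <stdint.h>
#include <stdio.h>

#define MAX_CPUMASK_BITS 255 /* value from the quoted sysemu.h hunk */
#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_MASK_WORDS \
    ((MAX_CPUMASK_BITS + SKETCH_BITS_PER_LONG - 1) / SKETCH_BITS_PER_LONG)

/* Sets bit `cpu` in `mask`; returns -1 and leaves the mask untouched
 * if the index does not fit in a MAX_CPUMASK_BITS-bit bitmap. */
static int set_cpu_checked(unsigned long *mask, uint16_t cpu)
{
    if (cpu >= MAX_CPUMASK_BITS) {
        fprintf(stderr, "qemu: CPU index %u out of range (max %d)\n",
                (unsigned)cpu, MAX_CPUMASK_BITS - 1);
        return -1;
    }
    mask[cpu / SKETCH_BITS_PER_LONG] |= 1UL << (cpu % SKETCH_BITS_PER_LONG);
    return 0;
}
```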

> +
> +if (opts->has_mem) {
> +int64_t mem_size;
> +char *endptr;
> +mem_size = strtosz(opts->mem, &endptr);
> +if (mem_size < 0 || *endptr) {
> +fprintf(stderr, "qemu: invalid numa mem size: %s\n", opts->mem);
> +return -1;
> +}
> +node_mem[nodenr] = mem_size;
> +}
> +
> +return 0;
> +}
> +
> +static int numa_mem_parse(NumaMemOp

Re: [Qemu-devel] [Qemu-trivial] [PATCH] cputlb: remove dead function tlb_update_dirty

2013-09-03 Thread Li Guang

On Tue, 2013-09-03 at 18:54 +0200, Andreas Färber wrote:
> On 03.09.2013 13:17, Michael Tokarev wrote:
> > 03.09.2013 12:35, Andreas Färber wrote:
> >> I also don't understand why qemu-trivial is suddenly picking up Stefan's
> >> arm translation patch, it used to be for unmaintained areas only. But
> >> arm is not my problem.
> > 
> > Which patch you're talking about?  Is it "target-arm: Report unimplemented
> > opcodes (LOG_UNIMP)" ?
> 
> Yes.
> 
> >  If yes, that one appears to be trivial as it just
> > adds some logging before failing an instruction and should not conflict
> > with other work being done in this area.  Perhaps I was too aggressive
> > while picking up the backlog.  We should just draw the line *somewhere*, --
> 
> Right, that line is what I'm reminding about here. I feel that lately an
> increasing number of contributors and reviewers are deferring patches to
> qemu-trivial that don't really belong there IMO. That Anthony doesn't
> scale to cover Blue's maintainer work as well shouldn't lead to a surge
> on qemu-trivial.
> 
> > eg, it sure is possible to reject spelling fixes for maintained areas
> > from -trivial (like this arm tree), - will this be productive?
> 
> No, spelling fixes are not a concern to me as they are rather unlikely
> to cause conflicts with patches being queued by submaintainers. :)
> 
> > This change (cputlb: remove dead function) appears to be "trivial enough"
> > for me (after looking at the usage history of this function), and I'd
> > pick it up without this Andreas's request, too.
> 
> Yes. This one here would've been okay usually, as there is no official
> maintainer for cputlb.c and it's trivial in the sense that a git-grep
> confirms it to be okay. I was just annoyed that I had to defer my pull
> twice (sent it out now) because s390x added two CPU loops and then once
> that was merged ppc added another loop, too. My upcoming 35+ patch
> series qom-cpu-13 may hopefully explain the rest once you see it.
> 
> > As for the "suddenly" - it's not really suddenly, it's because it
> > has been Cc'd to -trivial (by someone who submitted lots of good
> > trivial patches before) and actually looks trivial, too.  And also
> > because subsystem maintainer added his Reviewed-by, apparently (or
> > hopefully) after noticing it's submitted to -trivial.  I also Cc'd
> > both maintainers in my notice that it's been applied to -trivial.
> 
> "Suddenly" in the sense that the purpose of qemu-trivial used to be
> handling patches that would otherwise fall through the cracks.
> 
> So by my understanding, e.g., "target-arm:" => !trivial, and I would've
> expected there to be some on-list communication between PMM and you
> before CC'ing someone on a "thanks, applied" after the fact.
> By contrast, if there's a change to configure or "Fix spelling of" etc.
> then you picking it up is highly appreciated. I just don't want
> qemu-trivial becoming the least-resistance way of getting patches into
> qemu.git that might otherwise get bounced/changed by submaintainers.
> 
> Also, I am seeing Paolo pull in huge memory changes but now pinging the
> breakage fixes rather than assembling a pull to fix the fallout. ;)
> 
> Similarly target-i386 TCG is not suited for qemu-trivial IMO, instead
> rth or someone who works on and/or reviews it (rth?) should volunteer as
> proper maintainer. 

 I'd like to maintain cputlb.c, can I?

> With the larger part of the community using KVM these
> days, we simply can't have that be handled by the community at large any
> more.
> 
> So yes, I know you were on vacation and you seem eager to take up work
> again, that's great; I'm just cautioning that CC'ing everything on
> qemu-trivial (not your fault, you're on the receiving end) can't be the
> new solution, so feel encouraged to push back a little. :)
> 
> Cheers,
> Andreas
> 

-- 
Thanks!

Li Guang

Re: [Qemu-devel] [RFC PATCH] spapr: support time base offset migration

2013-09-03 Thread Alexander Graf


On 04.09.2013, at 03:13, Alexey Kardashevskiy wrote:

> On 09/03/2013 07:22 PM, Andreas Färber wrote:
>> On 03.09.2013 11:07, Alexey Kardashevskiy wrote:
>>> On 09/03/2013 06:42 PM, Andreas Färber wrote:
>>>> On 03.09.2013 09:31, Alexey Kardashevskiy wrote:
>>>>> diff --git a/target-ppc/machine.c b/target-ppc/machine.c
>>>>> index 12e1512..d1ffc7f 100644
>>>>> --- a/target-ppc/machine.c
>>>>> +++ b/target-ppc/machine.c
>> [...]
>>>>> +static const VMStateDescription vmstate_timebase = {
>>>>> +.name = "cpu/timebase",
>>>>> +.version_id = 1,
>>>>> +.minimum_version_id = 1,
>>>>> +.minimum_version_id_old = 1,
>>>>> +.pre_save = timebase_pre_save,
>>>>> +.post_load = timebase_post_load,
>>>>> +.fields  = (VMStateField []) {
>>>>> +VMSTATE_UINT64(timebase, ppc_tb_t),
>>>>> +VMSTATE_INT64(tb_offset, ppc_tb_t),
>>>>> +VMSTATE_UINT64(time_of_the_day, ppc_tb_t),
>>>>> +VMSTATE_UINT32_EQUAL(tb_freq, ppc_tb_t),
>>>>> +VMSTATE_END_OF_LIST()
>>>>> +},
>>>>> +};
>>>>> +
>>>>> const VMStateDescription vmstate_ppc_cpu = {
>>>>> .name = "cpu",
>>>>> .version_id = 5,
>>>>> @@ -498,6 +538,10 @@ const VMStateDescription vmstate_ppc_cpu = {
>>>>> VMSTATE_UINT64_EQUAL(env.insns_flags, PowerPCCPU),
>>>>> VMSTATE_UINT64_EQUAL(env.insns_flags2, PowerPCCPU),
>>>>> VMSTATE_UINT32_EQUAL(env.nb_BATs, PowerPCCPU),
>>>>> +
>>>>> +/* Time offset */
>>>>> +VMSTATE_STRUCT_POINTER(env.tb_env, PowerPCCPU,
>>>>> +   vmstate_timebase, ppc_tb_t *),
>>>>> VMSTATE_END_OF_LIST()
>>>>> },
>>>>> .subsections = (VMStateSubsection []) {
>>>>
>>>> Breaks the migration format. ;) You need to bump version_id and use a
>>>> macro that accepts the version the field was added in as argument.
>>> 
>>> Risking of being called ignorant, I'll still ask - is the patch below what
>>> you mean? I could not find VMSTATE_STRUCT_POINTER_V and I do not believe it
>>> is not there already.
>> 
>> Usually the way we do it is to have VMSTATE_STRUCT_POINTER() call
>> VMSTATE_STRUCT_POINTER_V() and thus VMSTATE_STRUCT_POINTER_TEST() call a
>> new VMSTATE_STRUCT_POINTER_TEST_V(), to avoid code duplication of the
>> core array entry. CC'ing Juan. Please do the VMState preparation in a
>> separate patch.
>> 
>> ppc usage looks fine.
>> 
>>> btw why would it break? Just asking. Is it because the source may send what
>>> the destination cannot handle? Named fields should stop the migration the
>>> same way as version mismatch would have done.
>> 
>> Nope, field names do not get transmitted, only the section names.
>> (Otherwise random code refactorings could break the migration format.)
>> 
>>> Or the source won't sent what the destination expects and we do not want
>>> this destination guest to continue?
>> 
>> There's an incoming stream of data from either live migration or a file,
>> and QEMU must decide whether it can read and how to interpret the raw
>> bytestream. It shouldn't just read random bytes into a new field when
>> they were written from a different field.
>> 
>>> Once I was told that migration between different versions of QEMU is not
>>> supported - so what is the point of .version_id field at all?
>> 
>> Not sure who told such a thing and in what context. On x86 we try to
>> avoid version_id bumps by adding subsections to allow migration in both
>> ways (including from newer to older) but for ppc, arm and all others we
>> do require that new fields are marked as such. Whether migration is
>> officially supported is a different matter from the VMState wire format.
> 
> 
> Why is the approach different for x86 and ppc here? I can convert
> VMSTATE_STRUCT_POINTER to a subsection, why should not I? Or ppc is not
> mature enough and therefore there is no need to keep compatibility? Thanks.

Just bump the version.


Alex
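
To illustrate why a new field needs a version tag, here is a toy model of the
load side. It is only a sketch: the descriptor, state struct and loader are
hypothetical and much simpler than QEMU's VMStateField machinery, and the
stream is read in host byte order. It does mirror the point made in the
thread, namely that only raw bytes travel on the wire and the loader relies on
the section version to decide which fields are present.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy field descriptor, hypothetical and far simpler than VMStateField. */
struct toy_field {
    size_t offset;
    size_t size;
    int version_id; /* section version in which the field first appeared */
};

struct toy_state {
    uint32_t nb_bats;
    uint64_t tb_offset; /* field added in version 2 */
};

#define TOY_CURRENT_VERSION 2

static const struct toy_field toy_fields[] = {
    { offsetof(struct toy_state, nb_bats),   sizeof(uint32_t), 1 },
    { offsetof(struct toy_state, tb_offset), sizeof(uint64_t), 2 },
};

/* Returns the number of bytes consumed, or -1 if the stream is newer than
 * this loader understands or too short for the fields it should contain. */
static long toy_load(struct toy_state *s, const uint8_t *buf, size_t len,
                     int stream_version)
{
    size_t pos = 0;

    if (stream_version > TOY_CURRENT_VERSION) {
        return -1;
    }
    for (size_t i = 0; i < sizeof(toy_fields) / sizeof(toy_fields[0]); i++) {
        const struct toy_field *f = &toy_fields[i];

        /* Fields newer than the stream were never written: skip them. */
        if (f->version_id > stream_version) {
            continue;
        }
        if (len - pos < f->size) {
            return -1;
        }
        memcpy((uint8_t *)s + f->offset, buf + pos, f->size);
        pos += f->size;
    }
    return (long)pos;
}
```

Without the per-field version, a version-1 stream fed to this loader would
have the bytes that follow nb_bats interpreted as tb_offset.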

Re: [Qemu-devel] [RFC PATCH] spapr: support time base offset migration

2013-09-03 Thread Alexey Kardashevskiy

On 09/03/2013 07:22 PM, Andreas Färber wrote:
> On 03.09.2013 11:07, Alexey Kardashevskiy wrote:
>> On 09/03/2013 06:42 PM, Andreas Färber wrote:
>>> On 03.09.2013 09:31, Alexey Kardashevskiy wrote:
>>>> diff --git a/target-ppc/machine.c b/target-ppc/machine.c
>>>> index 12e1512..d1ffc7f 100644
>>>> --- a/target-ppc/machine.c
>>>> +++ b/target-ppc/machine.c
> [...]
>>>> +static const VMStateDescription vmstate_timebase = {
>>>> +.name = "cpu/timebase",
>>>> +.version_id = 1,
>>>> +.minimum_version_id = 1,
>>>> +.minimum_version_id_old = 1,
>>>> +.pre_save = timebase_pre_save,
>>>> +.post_load = timebase_post_load,
>>>> +.fields  = (VMStateField []) {
>>>> +VMSTATE_UINT64(timebase, ppc_tb_t),
>>>> +VMSTATE_INT64(tb_offset, ppc_tb_t),
>>>> +VMSTATE_UINT64(time_of_the_day, ppc_tb_t),
>>>> +VMSTATE_UINT32_EQUAL(tb_freq, ppc_tb_t),
>>>> +VMSTATE_END_OF_LIST()
>>>> +},
>>>> +};
>>>> +
>>>>  const VMStateDescription vmstate_ppc_cpu = {
>>>>  .name = "cpu",
>>>>  .version_id = 5,
>>>> @@ -498,6 +538,10 @@ const VMStateDescription vmstate_ppc_cpu = {
>>>>  VMSTATE_UINT64_EQUAL(env.insns_flags, PowerPCCPU),
>>>>  VMSTATE_UINT64_EQUAL(env.insns_flags2, PowerPCCPU),
>>>>  VMSTATE_UINT32_EQUAL(env.nb_BATs, PowerPCCPU),
>>>> +
>>>> +/* Time offset */
>>>> +VMSTATE_STRUCT_POINTER(env.tb_env, PowerPCCPU,
>>>> +   vmstate_timebase, ppc_tb_t *),
>>>>  VMSTATE_END_OF_LIST()
>>>>  },
>>>>  .subsections = (VMStateSubsection []) {
>>>
>>> Breaks the migration format. ;) You need to bump version_id and use a
>>> macro that accepts the version the field was added in as argument.
>>
>> Risking of being called ignorant, I'll still ask - is the patch below what
>> you mean? I could not find VMSTATE_STRUCT_POINTER_V and I do not believe it
>> is not there already.
> 
> Usually the way we do it is to have VMSTATE_STRUCT_POINTER() call
> VMSTATE_STRUCT_POINTER_V() and thus VMSTATE_STRUCT_POINTER_TEST() call a
> new VMSTATE_STRUCT_POINTER_TEST_V(), to avoid code duplication of the
> core array entry. CC'ing Juan. Please do the VMState preparation in a
> separate patch.
>
> ppc usage looks fine.
> 
>> btw why would it break? Just asking. Is it because the source may send what
>> the destination cannot handle? Named fields should stop the migration the
>> same way as version mismatch would have done.
> 
> Nope, field names do not get transmitted, only the section names.
> (Otherwise random code refactorings could break the migration format.)
> 
>> Or the source won't sent what the destination expects and we do not want
>> this destination guest to continue?
> 
> There's an incoming stream of data from either live migration or a file,
> and QEMU must decide whether it can read and how to interpret the raw
> bytestream. It shouldn't just read random bytes into a new field when
> they were written from a different field.
> 
>> Once I was told that migration between different versions of QEMU is not
>> supported - so what is the point of .version_id field at all?
> 
> Not sure who told such a thing and in what context. On x86 we try to
> avoid version_id bumps by adding subsections to allow migration in both
> ways (including from newer to older) but for ppc, arm and all others we
> do require that new fields are marked as such. Whether migration is
> officially supported is a different matter from the VMState wire format.


Why is the approach different for x86 and ppc here? I can convert
VMSTATE_STRUCT_POINTER to a subsection, why should not I? Or ppc is not
mature enough and therefore there is no need to keep compatibility? Thanks.


> 
> Regards,
> Andreas
> 
> 
>> alexey@ka1:~/pcipassthru/qemu$ git diff
>> diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
>> index 1c31b5d..7b624bf 100644
>> --- a/include/migration/vmstate.h
>> +++ b/include/migration/vmstate.h
>> @@ -499,6 +499,15 @@ extern const VMStateInfo vmstate_info_bitmap;
>>  #define VMSTATE_STRUCT_POINTER(_field, _state, _vmsd, _type)  \
>>  VMSTATE_STRUCT_POINTER_TEST(_field, _state, NULL, _vmsd, _type)
>>
>> +#define VMSTATE_STRUCT_POINTER_V(_field, _state,  _vmsd, _type, _version) { \
>> +.name = (stringify(_field)), \
>> +.version_id = (_version),\
>> +.vmsd = &(_vmsd),\
>> +.size = sizeof(_type),   \
>> +.flags= VMS_STRUCT|VMS_POINTER,  \
>> +.offset   = vmstate_offset_value(_state, _field, _type), \
>> +}
>> +
>>  #define VMSTATE_STRUCT_ARRAY(_field, _state, _num, _version, _vmsd, _type) \
>>  VMSTATE_STRUCT_ARRAY_TEST(_field, _state, _num, NULL, _version,   \
>>  _vmsd, _type)
>> diff --git a/target-ppc/machine.c b/target-ppc/ma

Re: [Qemu-devel] [PATCH] exec: avoid tcg_commit when kvm_enabled

2013-09-03 Thread Li Guang

On Tue, 2013-09-03 at 10:39 +0200, Andreas Färber wrote:
> On 03.09.2013 08:59, liguang wrote:
> > Signed-off-by: liguang 
> > ---
> >  exec.c |4 +++-
> >  1 files changed, 3 insertions(+), 1 deletions(-)
> > 
> > diff --git a/exec.c b/exec.c
> > index 3ca9381..4509daa 100644
> > --- a/exec.c
> > +++ b/exec.c
> > @@ -1824,7 +1824,9 @@ static void memory_map_init(void)
> >  address_space_init(&address_space_io, system_io, "I/O");
> >  
> >  memory_listener_register(&core_memory_listener, &address_space_memory);
> > -memory_listener_register(&tcg_memory_listener, &address_space_memory);
> > +if (!kvm_enabled()) {
> 
> if (tcg_enabled())? I'm guessing Xen and QTest don't need it either?
> 

I can't confirm that at the moment.
Can anybody help confirm whether Xen and QTest need tcg_commit?

> 
> > +memory_listener_register(&tcg_memory_listener, &address_space_memory);
> > +}
> >  }
> >  
> >  MemoryRegion *get_system_memory(void)
> 

-- 
Thanks!

Li Guang
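
For reference, a small sketch of why the two conditions are not
interchangeable once there are more than two accelerators. The enum and the
helper names are hypothetical (QEMU does not track accelerators this way), and
the sketch says nothing about whether Xen or QTest actually need the listener;
it only shows where `!kvm_enabled()` and `tcg_enabled()` disagree.

```c
#include <stdbool.h>

/* Hypothetical accelerator enum, for illustration only. */
enum accel_sketch { ACCEL_TCG, ACCEL_KVM, ACCEL_XEN, ACCEL_QTEST };

static bool sketch_kvm_enabled(enum accel_sketch a)
{
    return a == ACCEL_KVM;
}

static bool sketch_tcg_enabled(enum accel_sketch a)
{
    return a == ACCEL_TCG;
}

/* Condition used in the first version of the patch. */
static bool registers_listener_v1(enum accel_sketch a)
{
    return !sketch_kvm_enabled(a);
}

/* Condition suggested in review. */
static bool registers_listener_v2(enum accel_sketch a)
{
    return sketch_tcg_enabled(a);
}
```

Both conditions agree for TCG and KVM; they differ for Xen and QTest, where
only the first one would still register the TCG listener.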

[Qemu-devel] [PATCH v2 05/10] raven: set a correct PCI I/O memory region

2013-09-03 Thread Hervé Poussineau

PCI I/O region is 0x3f80 bytes starting at 0x8000.
Do not use global QEMU I/O region, which is only 64KB.

Signed-off-by: Hervé Poussineau 
---
 hw/pci-host/prep.c |   15 +--
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/hw/pci-host/prep.c b/hw/pci-host/prep.c
index 95fa2ea..af0bf2b 100644
--- a/hw/pci-host/prep.c
+++ b/hw/pci-host/prep.c
@@ -53,6 +53,7 @@ typedef struct PRePPCIState {
 
 qemu_irq irq[PCI_NUM_PINS];
 PCIBus pci_bus;
+MemoryRegion pci_io;
 MemoryRegion pci_intack;
 RavenPCIState pci_dev;
 } PREPPCIState;
@@ -136,13 +137,11 @@ static void raven_pcihost_realizefn(DeviceState *d, Error **errp)
 
 memory_region_init_io(&h->conf_mem, OBJECT(h), &pci_host_conf_be_ops, s,
   "pci-conf-idx", 1);
-sysbus_add_io(dev, 0xcf8, &h->conf_mem);
-sysbus_init_ioports(&h->busdev, 0xcf8, 1);
+memory_region_add_subregion(&s->pci_io, 0xcf8, &h->conf_mem);
 
 memory_region_init_io(&h->data_mem, OBJECT(h), &pci_host_data_be_ops, s,
   "pci-conf-data", 1);
-sysbus_add_io(dev, 0xcfc, &h->data_mem);
-sysbus_init_ioports(&h->busdev, 0xcfc, 1);
+memory_region_add_subregion(&s->pci_io, 0xcfc, &h->data_mem);
 
 memory_region_init_io(&h->mmcfg, OBJECT(s), &PPC_PCIIO_ops, s, "pciio", 0x0040);
 memory_region_add_subregion(address_space_mem, 0x8080, &h->mmcfg);
@@ -160,11 +159,15 @@ static void raven_pcihost_initfn(Object *obj)
 PCIHostState *h = PCI_HOST_BRIDGE(obj);
 PREPPCIState *s = RAVEN_PCI_HOST_BRIDGE(obj);
 MemoryRegion *address_space_mem = get_system_memory();
-MemoryRegion *address_space_io = get_system_io();
 DeviceState *pci_dev;
 
+memory_region_init(&s->pci_io, obj, "pci-io", 0x3f80);
+
+/* CPU address space */
+memory_region_add_subregion(address_space_mem, 0x8000, &s->pci_io);
 pci_bus_new_inplace(&s->pci_bus, DEVICE(obj), NULL,
-address_space_mem, address_space_io, 0, TYPE_PCI_BUS);
+address_space_mem, &s->pci_io, 0, TYPE_PCI_BUS);
+
 h->bus = &s->pci_bus;
 
 object_initialize(&s->pci_dev, TYPE_RAVEN_PCI_DEVICE);
-- 
1.7.10.4

[Qemu-devel] [PATCH v2 06/10] raven: set a correct PCI memory region

2013-09-03 Thread Hervé Poussineau

PCI memory region is 0x3f00 bytes starting at 0xc000.

However, keep compatibility with Open Hack'Ware expectations
by adding a hack for Open Hack'Ware display.

Signed-off-by: Hervé Poussineau 
---
 hw/pci-host/prep.c |9 ++---
 hw/ppc/prep.c  |9 +
 2 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/hw/pci-host/prep.c b/hw/pci-host/prep.c
index af0bf2b..bba76af 100644
--- a/hw/pci-host/prep.c
+++ b/hw/pci-host/prep.c
@@ -54,6 +54,7 @@ typedef struct PRePPCIState {
 qemu_irq irq[PCI_NUM_PINS];
 PCIBus pci_bus;
 MemoryRegion pci_io;
+MemoryRegion pci_memory;
 MemoryRegion pci_intack;
 RavenPCIState pci_dev;
 } PREPPCIState;
@@ -127,8 +128,6 @@ static void raven_pcihost_realizefn(DeviceState *d, Error **errp)
 MemoryRegion *address_space_mem = get_system_memory();
 int i;
 
-isa_mem_base = 0xc000;
-
 for (i = 0; i < PCI_NUM_PINS; i++) {
 sysbus_init_irq(dev, &s->irq[i]);
 }
@@ -162,11 +161,15 @@ static void raven_pcihost_initfn(Object *obj)
 DeviceState *pci_dev;
 
 memory_region_init(&s->pci_io, obj, "pci-io", 0x3f80);
+/* Open Hack'Ware hack: real size should be only 0x3f00 bytes */
+memory_region_init(&s->pci_memory, obj, "pci-memory",
+   0x3f00 + 0xc000ULL);
 
 /* CPU address space */
 memory_region_add_subregion(address_space_mem, 0x8000, &s->pci_io);
+memory_region_add_subregion(address_space_mem, 0xc000, &s->pci_memory);
 pci_bus_new_inplace(&s->pci_bus, DEVICE(obj), NULL,
-address_space_mem, &s->pci_io, 0, TYPE_PCI_BUS);
+&s->pci_memory, &s->pci_io, 0, TYPE_PCI_BUS);
 
 h->bus = &s->pci_bus;
 
diff --git a/hw/ppc/prep.c b/hw/ppc/prep.c
index 6df4324..e75c4f0 100644
--- a/hw/ppc/prep.c
+++ b/hw/ppc/prep.c
@@ -466,6 +466,7 @@ static void ppc_prep_init(QEMUMachineInitArgs *args)
 int linux_boot, i, nb_nics1;
 MemoryRegion *ram = g_new(MemoryRegion, 1);
 MemoryRegion *bios = g_new(MemoryRegion, 1);
+MemoryRegion *vga = g_new(MemoryRegion, 1);
 uint32_t kernel_base, initrd_base;
 long kernel_size, initrd_size;
 DeviceState *dev;
@@ -604,6 +605,14 @@ static void ppc_prep_init(QEMUMachineInitArgs *args)
 
 /* init basic PC hardware */
 pci_vga_init(pci_bus);
+/* Open Hack'Ware hack: PCI BAR#0 is programmed to 0xf000.
+ * While bios will access framebuffer at 0xf000, real physical
+ * address is 0xf000 + 0xc000 (PCI memory base).
+ * Alias the wrong memory accesses to the right place.
+ */
+memory_region_init_alias(vga, NULL, "vga-alias", pci_address_space(pci),
+ 0xf000, 0x100);
+memory_region_add_subregion_overlap(sysmem, 0xf000, vga, 10);
 
 nb_nics1 = nb_nics;
 if (nb_nics1 > NE2000_NB_MAX)
-- 
1.7.10.4

[Qemu-devel] [PATCH v2 10/10] raven: use raven_ for all function prefixes

2013-09-03 Thread Hervé Poussineau

Signed-off-by: Hervé Poussineau 
---
 hw/pci-host/prep.c |   40 +---
 1 file changed, 21 insertions(+), 19 deletions(-)

diff --git a/hw/pci-host/prep.c b/hw/pci-host/prep.c
index 38df10c..0de835a 100644
--- a/hw/pci-host/prep.c
+++ b/hw/pci-host/prep.c
@@ -69,7 +69,7 @@ typedef struct PRePPCIState {
 
 #define BIOS_SIZE (1024 * 1024)
 
-static inline uint32_t PPC_PCIIO_config(hwaddr addr)
+static inline uint32_t raven_pci_io_config(hwaddr addr)
 {
 int i;
 
@@ -81,36 +81,36 @@ static inline uint32_t PPC_PCIIO_config(hwaddr addr)
 return (addr & 0x7ff) |  (i << 11);
 }
 
-static void ppc_pci_io_write(void *opaque, hwaddr addr,
- uint64_t val, unsigned int size)
+static void raven_pci_io_write(void *opaque, hwaddr addr,
+   uint64_t val, unsigned int size)
 {
 PREPPCIState *s = opaque;
 PCIHostState *phb = PCI_HOST_BRIDGE(s);
-pci_data_write(phb->bus, PPC_PCIIO_config(addr), val, size);
+pci_data_write(phb->bus, raven_pci_io_config(addr), val, size);
 }
 
-static uint64_t ppc_pci_io_read(void *opaque, hwaddr addr,
-unsigned int size)
+static uint64_t raven_pci_io_read(void *opaque, hwaddr addr,
+  unsigned int size)
 {
 PREPPCIState *s = opaque;
 PCIHostState *phb = PCI_HOST_BRIDGE(s);
-return pci_data_read(phb->bus, PPC_PCIIO_config(addr), size);
+return pci_data_read(phb->bus, raven_pci_io_config(addr), size);
 }
 
-static const MemoryRegionOps PPC_PCIIO_ops = {
-.read = ppc_pci_io_read,
-.write = ppc_pci_io_write,
+static const MemoryRegionOps raven_pci_io_ops = {
+.read = raven_pci_io_read,
+.write = raven_pci_io_write,
 .endianness = DEVICE_LITTLE_ENDIAN,
 };
 
-static uint64_t ppc_intack_read(void *opaque, hwaddr addr,
-unsigned int size)
+static uint64_t raven_intack_read(void *opaque, hwaddr addr,
+  unsigned int size)
 {
 return pic_read_irq(isa_pic);
 }
 
-static const MemoryRegionOps PPC_intack_ops = {
-.read = ppc_intack_read,
+static const MemoryRegionOps raven_intack_ops = {
+.read = raven_intack_read,
 .valid = {
 .max_access_size = 1,
 },
@@ -181,12 +181,12 @@ static const MemoryRegionOps raven_io_ops = {
 .valid.unaligned = true,
 };
 
-static int prep_map_irq(PCIDevice *pci_dev, int irq_num)
+static int raven_map_irq(PCIDevice *pci_dev, int irq_num)
 {
 return (irq_num + (pci_dev->devfn >> 3)) & 1;
 }
 
-static void prep_set_irq(void *opaque, int irq_num, int level)
+static void raven_set_irq(void *opaque, int irq_num, int level)
 {
 qemu_irq *pic = opaque;
 
@@ -220,7 +220,8 @@ static void raven_pcihost_realizefn(DeviceState *d, Error **errp)
 
 qdev_init_gpio_in(d, raven_change_gpio, 1);
 
-pci_bus_irqs(&s->pci_bus, prep_set_irq, prep_map_irq, s->irq, PCI_NUM_PINS);
+pci_bus_irqs(&s->pci_bus, raven_set_irq, raven_map_irq, s->irq,
+ PCI_NUM_PINS);
 
 memory_region_init_io(&h->conf_mem, OBJECT(h), &pci_host_conf_le_ops, s,
   "pci-conf-idx", 4);
@@ -230,10 +231,11 @@ static void raven_pcihost_realizefn(DeviceState *d, Error **errp)
   "pci-conf-data", 4);
 memory_region_add_subregion(&s->pci_io, 0xcfc, &h->data_mem);
 
-memory_region_init_io(&h->mmcfg, OBJECT(s), &PPC_PCIIO_ops, s, "pciio", 0x0040);
+memory_region_init_io(&h->mmcfg, OBJECT(s), &raven_pci_io_ops, s,
+  "pciio", 0x0040);
 memory_region_add_subregion(address_space_mem, 0x8080, &h->mmcfg);
 
-memory_region_init_io(&s->pci_intack, OBJECT(s), &PPC_intack_ops, s,
+memory_region_init_io(&s->pci_intack, OBJECT(s), &raven_intack_ops, s,
   "pci-intack", 1);
 memory_region_add_subregion(address_space_mem, 0xbff0, &s->pci_intack);
 
-- 
1.7.10.4

[Qemu-devel] [PATCH v2 08/10] raven: implement non-contiguous I/O region

2013-09-03 Thread Hervé Poussineau

Remove now duplicated code from prep board.

Signed-off-by: Hervé Poussineau 
---
 hw/pci-host/prep.c |   82 +
 hw/ppc/prep.c  |   94 ++--
 2 files changed, 85 insertions(+), 91 deletions(-)

diff --git a/hw/pci-host/prep.c b/hw/pci-host/prep.c
index 3baf66f..db03adc 100644
--- a/hw/pci-host/prep.c
+++ b/hw/pci-host/prep.c
@@ -53,7 +53,9 @@ typedef struct PRePPCIState {
 
 qemu_irq irq[PCI_NUM_PINS];
 PCIBus pci_bus;
+AddressSpace pci_io_as;
 MemoryRegion pci_io;
+MemoryRegion pci_io_non_contiguous;
 MemoryRegion pci_memory;
 MemoryRegion pci_intack;
 MemoryRegion bm;
@@ -61,6 +63,8 @@ typedef struct PRePPCIState {
 MemoryRegion bm_pci_memory_alias;
 AddressSpace bm_as;
 RavenPCIState pci_dev;
+
+int contiguous_map;
 } PREPPCIState;
 
 #define BIOS_SIZE (1024 * 1024)
@@ -112,6 +116,71 @@ static const MemoryRegionOps PPC_intack_ops = {
 },
 };
 
+static inline hwaddr raven_io_address(PREPPCIState *s,
+  hwaddr addr)
+{
+if (s->contiguous_map == 0) {
+/* 64 KB contiguous space for IOs */
+addr &= 0x;
+} else {
+/* 8 MB non-contiguous space for IOs */
+addr = (addr & 0x1F) | ((addr & 0x007FFF000) >> 7);
+}
+
+/* FIXME: handle endianness switch */
+
+return addr;
+}
+
+static uint64_t raven_io_read(void *opaque, hwaddr addr,
+  unsigned int size)
+{
+PREPPCIState *s = opaque;
+uint8_t buf[4];
+
+addr = raven_io_address(s, addr);
+address_space_read(&s->pci_io_as, addr + 0x8000, buf, size);
+
+if (size == 1) {
+return buf[0];
+} else if (size == 2) {
+return lduw_p(buf);
+} else if (size == 4) {
+return ldl_p(buf);
+} else {
+assert(false);
+}
+}
+
+static void raven_io_write(void *opaque, hwaddr addr,
+   uint64_t val, unsigned int size)
+{
+PREPPCIState *s = opaque;
+uint8_t buf[4];
+
+addr = raven_io_address(s, addr);
+
+if (size == 1) {
+buf[0] = val;
+} else if (size == 2) {
+stw_p(buf, val);
+} else if (size == 4) {
+stl_p(buf, val);
+} else {
+assert(false);
+}
+
+address_space_write(&s->pci_io_as, addr + 0x8000, buf, size);
+}
+
+static const MemoryRegionOps raven_io_ops = {
+.read = raven_io_read,
+.write = raven_io_write,
+.endianness = DEVICE_LITTLE_ENDIAN,
+.impl.max_access_size = 4,
+.valid.unaligned = true,
+};
+
 static int prep_map_irq(PCIDevice *pci_dev, int irq_num)
 {
 return (irq_num + (pci_dev->devfn >> 3)) & 1;
@@ -131,6 +200,12 @@ static AddressSpace *raven_pcihost_set_iommu(PCIBus *bus, void *opaque,
 return &s->bm_as;
 }
 
+static void raven_change_gpio(void *opaque, int n, int level)
+{
+PREPPCIState *s = opaque;
+s->contiguous_map = level;
+}
+
 static void raven_pcihost_realizefn(DeviceState *d, Error **errp)
 {
 SysBusDevice *dev = SYS_BUS_DEVICE(d);
@@ -143,6 +218,8 @@ static void raven_pcihost_realizefn(DeviceState *d, Error **errp)
 sysbus_init_irq(dev, &s->irq[i]);
 }
 
+qdev_init_gpio_in(d, raven_change_gpio, 1);
+
 pci_bus_irqs(&s->pci_bus, prep_set_irq, prep_map_irq, s->irq, PCI_NUM_PINS);
 
 memory_region_init_io(&h->conf_mem, OBJECT(h), &pci_host_conf_be_ops, s,
@@ -172,12 +249,17 @@ static void raven_pcihost_initfn(Object *obj)
 DeviceState *pci_dev;
 
 memory_region_init(&s->pci_io, obj, "pci-io", 0x3f80);
+memory_region_init_io(&s->pci_io_non_contiguous, obj, &raven_io_ops, s,
+  "pci-io-non-contiguous", 0x0080);
 /* Open Hack'Ware hack: real size should be only 0x3f00 bytes */
 memory_region_init(&s->pci_memory, obj, "pci-memory",
0x3f00 + 0xc000ULL);
+address_space_init(&s->pci_io_as, &s->pci_io, "raven-io");
 
 /* CPU address space */
 memory_region_add_subregion(address_space_mem, 0x8000, &s->pci_io);
+memory_region_add_subregion_overlap(address_space_mem, 0x8000,
+&s->pci_io_non_contiguous, 1);
 memory_region_add_subregion(address_space_mem, 0xc000, &s->pci_memory);
 pci_bus_new_inplace(&s->pci_bus, DEVICE(obj), NULL,
 &s->pci_memory, &s->pci_io, 0, TYPE_PCI_BUS);
diff --git a/hw/ppc/prep.c b/hw/ppc/prep.c
index e75c4f0..70132a6 100644
--- a/hw/ppc/prep.c
+++ b/hw/ppc/prep.c
@@ -185,6 +185,7 @@ typedef struct sysctrl_t {
 uint8_t state;
 uint8_t syscontrol;
 int contiguous_map;
+qemu_irq contiguous_map_irq;
 int endian;
 } sysctrl_t;
 
@@ -253,6 +254,7 @@ static void PREP_io_800_writeb (void *opaque, uint32_t addr, uint32_t val)
 case 0x0850:
 /* I/O map type register */
 sysctrl->contiguous_map = val & 0x01;
+qemu_set_irq(sysctrl->contiguo
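
As a standalone illustration of the address translation that
raven_io_address() performs in this patch, here is a minimal sketch. The
function name is hypothetical, and the 0xFFFF mask for the contiguous case is
an assumption inferred from the "64 KB contiguous space" comment, since that
constant did not survive in the archived text. The non-contiguous expression
is copied from the patch: each 4 KB page of the 8 MB window carries 32 bytes
of I/O port space.

```c
#include <stdint.h>

/* Translate an offset within the I/O window into an I/O port offset.
 * contiguous_map follows the patch: 0 selects the 64 KB contiguous layout,
 * any other value selects the 8 MB non-contiguous layout. */
static uint64_t raven_io_address_sketch(int contiguous_map, uint64_t addr)
{
    if (contiguous_map == 0) {
        /* 64 KB contiguous space (mask value is an assumption) */
        addr &= 0xFFFF;
    } else {
        /* 8 MB non-contiguous space: keep the low 5 bits and fold the
         * page number down by 7 bits, as in the patch */
        addr = (addr & 0x1F) | ((addr & 0x007FFF000ULL) >> 7);
    }
    return addr;
}
```

For example, offset 0x1003 in the non-contiguous layout maps to port offset
0x23: page 1 contributes 0x20 and the in-page offset contributes 3.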

[Qemu-devel] [PATCH v2 07/10] raven: add PCI bus mastering address space

2013-09-03 Thread Hervé Poussineau

This has been tested on Linux 2.4/PPC with the lsi53c895a SCSI adapter.

Signed-off-by: Hervé Poussineau 
---
 hw/pci-host/prep.c |   23 +++
 1 file changed, 23 insertions(+)

diff --git a/hw/pci-host/prep.c b/hw/pci-host/prep.c
index bba76af..3baf66f 100644
--- a/hw/pci-host/prep.c
+++ b/hw/pci-host/prep.c
@@ -56,6 +56,10 @@ typedef struct PRePPCIState {
 MemoryRegion pci_io;
 MemoryRegion pci_memory;
 MemoryRegion pci_intack;
+MemoryRegion bm;
+MemoryRegion bm_ram_alias;
+MemoryRegion bm_pci_memory_alias;
+AddressSpace bm_as;
 RavenPCIState pci_dev;
 } PREPPCIState;
 
@@ -120,6 +124,13 @@ static void prep_set_irq(void *opaque, int irq_num, int level)
 qemu_set_irq(pic[irq_num] , level);
 }
 
+static AddressSpace *raven_pcihost_set_iommu(PCIBus *bus, void *opaque,
+ int devfn)
+{
+PREPPCIState *s = opaque;
+return &s->bm_as;
+}
+
 static void raven_pcihost_realizefn(DeviceState *d, Error **errp)
 {
 SysBusDevice *dev = SYS_BUS_DEVICE(d);
@@ -171,6 +182,18 @@ static void raven_pcihost_initfn(Object *obj)
 pci_bus_new_inplace(&s->pci_bus, DEVICE(obj), NULL,
 &s->pci_memory, &s->pci_io, 0, TYPE_PCI_BUS);
 
+/* Bus master address space */
+memory_region_init(&s->bm, obj, "bm-raven", UINT32_MAX);
+memory_region_init_alias(&s->bm_pci_memory_alias, obj, "bm-pci-memory",
+ &s->pci_memory, 0,
+ memory_region_size(&s->pci_memory));
+memory_region_init_alias(&s->bm_ram_alias, obj, "bm-system",
+ get_system_memory(), 0, 0x8000);
+memory_region_add_subregion(&s->bm, 0 , &s->bm_pci_memory_alias);
+memory_region_add_subregion(&s->bm, 0x8000, &s->bm_ram_alias);
+address_space_init(&s->bm_as, &s->bm, "raven-bm");
+pci_setup_iommu(&s->pci_bus, raven_pcihost_set_iommu, s);
+
 h->bus = &s->pci_bus;
 
 object_initialize(&s->pci_dev, TYPE_RAVEN_PCI_DEVICE);
-- 
1.7.10.4

[Qemu-devel] [PATCH v2 09/10] raven: fix PCI bus accesses with size > 1

2013-09-03 Thread Hervé Poussineau

Signed-off-by: Hervé Poussineau 
---
 hw/pci-host/prep.c |8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/hw/pci-host/prep.c b/hw/pci-host/prep.c
index db03adc..38df10c 100644
--- a/hw/pci-host/prep.c
+++ b/hw/pci-host/prep.c
@@ -222,12 +222,12 @@ static void raven_pcihost_realizefn(DeviceState *d, Error **errp)
 
 pci_bus_irqs(&s->pci_bus, prep_set_irq, prep_map_irq, s->irq, PCI_NUM_PINS);
 
-memory_region_init_io(&h->conf_mem, OBJECT(h), &pci_host_conf_be_ops, s,
-  "pci-conf-idx", 1);
+memory_region_init_io(&h->conf_mem, OBJECT(h), &pci_host_conf_le_ops, s,
+  "pci-conf-idx", 4);
 memory_region_add_subregion(&s->pci_io, 0xcf8, &h->conf_mem);
 
-memory_region_init_io(&h->data_mem, OBJECT(h), &pci_host_data_be_ops, s,
-  "pci-conf-data", 1);
+memory_region_init_io(&h->data_mem, OBJECT(h), &pci_host_data_le_ops, s,
+  "pci-conf-data", 4);
 memory_region_add_subregion(&s->pci_io, 0xcfc, &h->data_mem);
 
 memory_region_init_io(&h->mmcfg, OBJECT(s), &PPC_PCIIO_ops, s, "pciio", 0x0040);
-- 
1.7.10.4

[Qemu-devel] [PATCH v2 01/10] prep: kill get_system_io() usage

2013-09-03 Thread Hervé Poussineau

While ISA address space in prep machine is currently the one returned
by get_system_io(), this depends on the implementation of i82378/raven
devices, and this may not be the case forever.

Use the right ISA address space when adding some more ports to it.
We can use whatever ISA device on the right ISA bus, as all ISA devices
on the same ISA bus share the same ISA address space.

Signed-off-by: Hervé Poussineau 
---
 hw/ppc/prep.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/ppc/prep.c b/hw/ppc/prep.c
index 7e04b1a..efc892d 100644
--- a/hw/ppc/prep.c
+++ b/hw/ppc/prep.c
@@ -656,7 +656,7 @@ static void ppc_prep_init(QEMUMachineInitArgs *args)
 sysctrl->reset_irq = cpu->env.irq_inputs[PPC6xx_INPUT_HRESET];
 
 portio_list_init(port_list, NULL, prep_portio_list, sysctrl, "prep");
-portio_list_add(port_list, get_system_io(), 0x0);
+portio_list_add(port_list, isa_address_space_io(isa), 0x0);
 
 /* PowerPC control and status register group */
 #if 0
-- 
1.7.10.4

[Qemu-devel] [PATCH v2 04/10] raven: rename intack region to pci_intack

2013-09-03 Thread Hervé Poussineau

Regions added in next patches will also have the pci_ prefix.

Signed-off-by: Hervé Poussineau 
---
 hw/pci-host/prep.c |7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/hw/pci-host/prep.c b/hw/pci-host/prep.c
index 25baef1..95fa2ea 100644
--- a/hw/pci-host/prep.c
+++ b/hw/pci-host/prep.c
@@ -51,9 +51,9 @@ typedef struct RavenPCIState {
 typedef struct PRePPCIState {
 PCIHostState parent_obj;
 
-MemoryRegion intack;
 qemu_irq irq[PCI_NUM_PINS];
 PCIBus pci_bus;
+MemoryRegion pci_intack;
 RavenPCIState pci_dev;
 } PREPPCIState;
 
@@ -147,8 +147,9 @@ static void raven_pcihost_realizefn(DeviceState *d, Error **errp)
 memory_region_init_io(&h->mmcfg, OBJECT(s), &PPC_PCIIO_ops, s, "pciio", 0x0040);
 memory_region_add_subregion(address_space_mem, 0x8080, &h->mmcfg);
 
-memory_region_init_io(&s->intack, OBJECT(s), &PPC_intack_ops, s, "pci-intack", 1);
-memory_region_add_subregion(address_space_mem, 0xbff0, &s->intack);
+memory_region_init_io(&s->pci_intack, OBJECT(s), &PPC_intack_ops, s,
+  "pci-intack", 1);
+memory_region_add_subregion(address_space_mem, 0xbff0, &s->pci_intack);
 
 /* TODO Remove once realize propagates to child devices. */
 object_property_set_bool(OBJECT(&s->pci_dev), true, "realized", errp);
-- 
1.7.10.4

[Qemu-devel] [PATCH v2 03/10] raven: move BIOS loading from board code to PCI host

2013-09-03 Thread Hervé Poussineau

Raven datasheet explains where firmware lives in system memory, so do
it there instead of in board code. Other boards using the same PCI
host will not have to copy the firmware loading code.

However, add a specific hack for Open Hack'Ware, which provides only
a 512KB blob to be loaded at 0xfff0, but expects valid code at
0xfffc (specific Open Hack'Ware reset instruction pointer).
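
The placement arithmetic used by this patch can be tried in isolation. The following is a minimal standalone sketch, not QEMU code: the helper names top_aligned_base(), round_up_4k() and raw_image_fits() are made up for illustration. Only BIOS_SIZE, the (uint32_t)(-BIOS_SIZE) style of cast and the 0xfff rounding mask are taken from the diff.

```c
#include <assert.h>
#include <stdint.h>

#define BIOS_SIZE (1024 * 1024)   /* same value as in the patch */

/* Hypothetical helper: base address of a region of `size` bytes that ends
 * exactly at the 4 GiB boundary.  Unsigned wrap-around does the work. */
static uint32_t top_aligned_base(uint32_t size)
{
    assert(size != 0);
    return (uint32_t)(0u - size);
}

/* Hypothetical helper: round an image size up to the next 4 KiB boundary,
 * using the same mask as the patch. */
static int round_up_4k(int size)
{
    return (size + 0xfff) & ~0xfff;
}

/* Hypothetical helper: nonzero if a raw image of `image_size` bytes may be
 * loaded into the BIOS region (same bounds check as the patch). */
static int raw_image_fits(int image_size)
{
    return image_size > 0 && image_size <= BIOS_SIZE;
}
```

With BIOS_SIZE at 1 MB, top_aligned_base(BIOS_SIZE) gives the start of the last megabyte of the 32-bit address space, so an image smaller than the region starts at that base and leaves the top of the region unpopulated.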

Signed-off-by: Hervé Poussineau 
---
 hw/pci-host/prep.c |   51 +++
 hw/ppc/prep.c  |   50 +-
 2 files changed, 64 insertions(+), 37 deletions(-)

diff --git a/hw/pci-host/prep.c b/hw/pci-host/prep.c
index 557486e..25baef1 100644
--- a/hw/pci-host/prep.c
+++ b/hw/pci-host/prep.c
@@ -28,7 +28,9 @@
 #include "hw/pci/pci_bus.h"
 #include "hw/pci/pci_host.h"
 #include "hw/i386/pc.h"
+#include "hw/loader.h"
 #include "exec/address-spaces.h"
+#include "elf.h"
 
 #define TYPE_RAVEN_PCI_DEVICE "raven"
 #define TYPE_RAVEN_PCI_HOST_BRIDGE "raven-pcihost"
@@ -38,6 +40,9 @@
 
 typedef struct RavenPCIState {
 PCIDevice dev;
+uint32_t elf_machine;
+char *bios_name;
+MemoryRegion bios;
 } RavenPCIState;
 
 #define RAVEN_PCI_HOST_BRIDGE(obj) \
@@ -52,6 +57,8 @@ typedef struct PRePPCIState {
 RavenPCIState pci_dev;
 } PREPPCIState;
 
+#define BIOS_SIZE (1024 * 1024)
+
 static inline uint32_t PPC_PCIIO_config(hwaddr addr)
 {
 int i;
@@ -169,10 +176,46 @@ static void raven_pcihost_initfn(Object *obj)
 
 static int raven_init(PCIDevice *d)
 {
+Object *obj = OBJECT(d);
+RavenPCIState *s = RAVEN_PCI_DEVICE(d);
+char *filename;
+int bios_size = -1;
+
 d->config[0x0C] = 0x08; // cache_line_size
 d->config[0x0D] = 0x10; // latency_timer
 d->config[0x34] = 0x00; // capabilities_pointer
 
+memory_region_init_ram(&s->bios, obj, "bios", BIOS_SIZE);
+memory_region_set_readonly(&s->bios, true);
+memory_region_add_subregion(get_system_memory(), (uint32_t)(-BIOS_SIZE),
+&s->bios);
+vmstate_register_ram_global(&s->bios);
+if (s->bios_name) {
+filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, s->bios_name);
+if (filename) {
+if (s->elf_machine != EM_NONE) {
+bios_size = load_elf(filename, NULL, NULL, NULL,
+ NULL, NULL, 1, s->elf_machine, 0);
+}
+if (bios_size < 0) {
+bios_size = get_image_size(filename);
+if (bios_size > 0 && bios_size <= BIOS_SIZE) {
+hwaddr bios_addr;
+bios_size = (bios_size + 0xfff) & ~0xfff;
+bios_addr = (uint32_t)(-BIOS_SIZE);
+bios_size = load_image_targphys(filename, bios_addr,
+bios_size);
+}
+}
+}
+if (bios_size < 0 || bios_size > BIOS_SIZE) {
+hw_error("qemu: could not load bios image '%s'\n", s->bios_name);
+}
+if (filename) {
+g_free(filename);
+}
+}
+
 return 0;
 }
 
@@ -208,12 +251,20 @@ static const TypeInfo raven_info = {
 .class_init = raven_class_init,
 };
 
+static Property raven_pcihost_properties[] = {
+DEFINE_PROP_UINT32("elf-machine", PREPPCIState, pci_dev.elf_machine,
+   EM_NONE),
+DEFINE_PROP_STRING("bios-name", PREPPCIState, pci_dev.bios_name),
+DEFINE_PROP_END_OF_LIST()
+};
+
 static void raven_pcihost_class_init(ObjectClass *klass, void *data)
 {
 DeviceClass *dc = DEVICE_CLASS(klass);
 
 set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
 dc->realize = raven_pcihost_realizefn;
+dc->props = raven_pcihost_properties;
 dc->fw_name = "pci";
 dc->no_user = 1;
 }
diff --git a/hw/ppc/prep.c b/hw/ppc/prep.c
index efc892d..6df4324 100644
--- a/hw/ppc/prep.c
+++ b/hw/ppc/prep.c
@@ -456,7 +456,6 @@ static void ppc_prep_init(QEMUMachineInitArgs *args)
 MemoryRegion *sysmem = get_system_memory();
 PowerPCCPU *cpu = NULL;
 CPUPPCState *env = NULL;
-char *filename;
 nvram_t nvram;
 M48t59State *m48t59;
 MemoryRegion *PPC_io_memory = g_new(MemoryRegion, 1);
@@ -464,7 +463,7 @@ static void ppc_prep_init(QEMUMachineInitArgs *args)
 #if 0
 MemoryRegion *xcsr = g_new(MemoryRegion, 1);
 #endif
-int linux_boot, i, nb_nics1, bios_size;
+int linux_boot, i, nb_nics1;
 MemoryRegion *ram = g_new(MemoryRegion, 1);
 MemoryRegion *bios = g_new(MemoryRegion, 1);
 uint32_t kernel_base, initrd_base;
@@ -510,41 +509,13 @@ static void ppc_prep_init(QEMUMachineInitArgs *args)
 memory_region_add_subregion(sysmem, 0, ram);
 
 /* allocate and load BIOS */
-memory_region_init_ram(bios, NULL, "ppc_prep.bios", BIOS_SIZE);
-memory_region_set_readonly(bios, true);
-memory_region_add_subregion(sysmem, (uint32_t)(-BIOS_SIZE), bios);
-vmstate_register_ram_global(bi

[Qemu-devel] [PATCH v2 00/10] prep: improve Raven PCI host emulation

2013-09-03 Thread Hervé Poussineau

This patchset improves Raven PCI host emulation, found in some PPC platforms,
like the QEMU 'prep' one, and for example the IBM RS/6000 40p.

Some features added to raven emulation were already present in prep board
(non-contiguous I/O, firmware loading), while others are new (PCI bus
mastering memory region).

This patchset has been tested against Linux 2.4 PPC and IBM RS/6000 40p
firmware.

Notable achievements are PCI bus mastering (tested with lsi53c895a SCSI
adapter), lots of cleanup and emulation correctness, and also documentation
of current hacks required by Open Hack'Ware.
This gives us a good base to replace Open Hack'Ware with a possible upcoming
OpenBIOS release.

Changes since v1:
- reworked a dubious memcpy to make it work on big endian hosts
- split into multiple patches

Hervé Poussineau (10):
  prep: kill get_system_io() usage
  raven: use constant PCI_NUM_PINS instead of 4
  raven: move BIOS loading from board code to PCI host
  raven: rename intack region to pci_intack
  raven: set a correct PCI I/O memory region
  raven: set a correct PCI memory region
  raven: add PCI bus mastering address space
  raven: implement non-contiguous I/O region
  raven: fix PCI bus accesses with size > 1
  raven: use raven_ for all function prefixes

 hw/pci-host/prep.c |  235 
 hw/ppc/prep.c  |  155 ++
 2 files changed, 226 insertions(+), 164 deletions(-)

-- 
1.7.10.4

[Qemu-devel] [PATCH v2 02/10] raven: use constant PCI_NUM_PINS instead of 4

2013-09-03 Thread Hervé Poussineau

Signed-off-by: Hervé Poussineau 
---
 hw/pci-host/prep.c |6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/pci-host/prep.c b/hw/pci-host/prep.c
index e120058..557486e 100644
--- a/hw/pci-host/prep.c
+++ b/hw/pci-host/prep.c
@@ -47,7 +47,7 @@ typedef struct PRePPCIState {
 PCIHostState parent_obj;
 
 MemoryRegion intack;
-qemu_irq irq[4];
+qemu_irq irq[PCI_NUM_PINS];
 PCIBus pci_bus;
 RavenPCIState pci_dev;
 } PREPPCIState;
@@ -121,11 +121,11 @@ static void raven_pcihost_realizefn(DeviceState *d, Error **errp)
 
 isa_mem_base = 0xc000;
 
-for (i = 0; i < 4; i++) {
+for (i = 0; i < PCI_NUM_PINS; i++) {
 sysbus_init_irq(dev, &s->irq[i]);
 }
 
-pci_bus_irqs(&s->pci_bus, prep_set_irq, prep_map_irq, s->irq, 4);
+pci_bus_irqs(&s->pci_bus, prep_set_irq, prep_map_irq, s->irq, PCI_NUM_PINS);
 
 memory_region_init_io(&h->conf_mem, OBJECT(h), &pci_host_conf_be_ops, s,
   "pci-conf-idx", 1);
-- 
1.7.10.4

[Qemu-devel] [PATCH v5 13/16] block: vhdx - move more endian translations to vhdx-endian.c

2013-09-03 Thread Jeff Cody

In preparation for vhdx_create(), move more endian translation
functions out to vhdx-endian.c.
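
The import/export pattern these helpers follow can be sketched outside QEMU. This is only an illustration under stated assumptions: ExampleHeader is a made-up three-field struct, not the real VHDX region table header, and the example_* functions are hypothetical stand-ins for QEMU's le32_to_cpus()/cpu_to_le32s(), written with byte shifts so they behave the same on little-endian and big-endian hosts.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical on-disk header; NOT the real VHDX layout. */
typedef struct ExampleHeader {
    uint32_t signature;
    uint32_t checksum;
    uint32_t entry_count;
} ExampleHeader;

/* Convert in place: memory holds little-endian bytes, result is host order. */
static void example_le32_to_cpus(uint32_t *v)
{
    uint8_t raw[4];

    memcpy(raw, v, sizeof(raw));
    *v = (uint32_t)raw[0] | ((uint32_t)raw[1] << 8) |
         ((uint32_t)raw[2] << 16) | ((uint32_t)raw[3] << 24);
}

/* Convert in place: value is host order, memory becomes little-endian bytes. */
static void example_cpu_to_le32s(uint32_t *v)
{
    uint8_t raw[4];

    raw[0] = (uint8_t)(*v);
    raw[1] = (uint8_t)(*v >> 8);
    raw[2] = (uint8_t)(*v >> 16);
    raw[3] = (uint8_t)(*v >> 24);
    memcpy(v, raw, sizeof(raw));
}

/* One import and one export function per structure, touching every
 * multi-byte field, mirrors the shape of the functions in this patch. */
static void example_header_le_import(ExampleHeader *hdr)
{
    assert(hdr != NULL);
    example_le32_to_cpus(&hdr->signature);
    example_le32_to_cpus(&hdr->checksum);
    example_le32_to_cpus(&hdr->entry_count);
}

static void example_header_le_export(ExampleHeader *hdr)
{
    assert(hdr != NULL);
    example_cpu_to_le32s(&hdr->signature);
    example_cpu_to_le32s(&hdr->checksum);
    example_cpu_to_le32s(&hdr->entry_count);
}
```

Keeping the per-field conversions in one pair of functions per structure means a reader and a writer of the same structure cannot drift apart on which fields get swapped.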

Signed-off-by: Jeff Cody 
---
 block/vhdx-endian.c | 75 +
 block/vhdx.c| 20 +++---
 block/vhdx.h|  9 ++-
 3 files changed, 87 insertions(+), 17 deletions(-)

diff --git a/block/vhdx-endian.c b/block/vhdx-endian.c
index 3e93e63..fe879ed 100644
--- a/block/vhdx-endian.c
+++ b/block/vhdx-endian.c
@@ -139,3 +139,78 @@ void vhdx_log_entry_hdr_le_export(VHDXLogEntryHeader *hdr)
 }
 
 
+/* Region table entries */
+void vhdx_region_header_le_import(VHDXRegionTableHeader *hdr)
+{
+assert(hdr != NULL);
+
+le32_to_cpus(&hdr->signature);
+le32_to_cpus(&hdr->checksum);
+le32_to_cpus(&hdr->entry_count);
+}
+
+void vhdx_region_header_le_export(VHDXRegionTableHeader *hdr)
+{
+assert(hdr != NULL);
+
+cpu_to_le32s(&hdr->signature);
+cpu_to_le32s(&hdr->checksum);
+cpu_to_le32s(&hdr->entry_count);
+}
+
+void vhdx_region_entry_le_import(VHDXRegionTableEntry *e)
+{
+assert(e != NULL);
+
+leguid_to_cpus(&e->guid);
+le64_to_cpus(&e->file_offset);
+le32_to_cpus(&e->length);
+le32_to_cpus(&e->data_bits);
+}
+
+void vhdx_region_entry_le_export(VHDXRegionTableEntry *e)
+{
+assert(e != NULL);
+
+cpu_to_leguids(&e->guid);
+cpu_to_le64s(&e->file_offset);
+cpu_to_le32s(&e->length);
+cpu_to_le32s(&e->data_bits);
+}
+
+
+/* Metadata headers & table */
+void vhdx_metadata_header_le_import(VHDXMetadataTableHeader *hdr)
+{
+assert(hdr != NULL);
+
+le64_to_cpus(&hdr->signature);
+le16_to_cpus(&hdr->entry_count);
+}
+
+void vhdx_metadata_header_le_export(VHDXMetadataTableHeader *hdr)
+{
+assert(hdr != NULL);
+
+cpu_to_le64s(&hdr->signature);
+cpu_to_le16s(&hdr->entry_count);
+}
+
+void vhdx_metadata_entry_le_import(VHDXMetadataTableEntry *e)
+{
+assert(e != NULL);
+
+leguid_to_cpus(&e->item_id);
+le32_to_cpus(&e->offset);
+le32_to_cpus(&e->length);
+le32_to_cpus(&e->data_bits);
+}
+void vhdx_metadata_entry_le_export(VHDXMetadataTableEntry *e)
+{
+assert(e != NULL);
+
+cpu_to_leguids(&e->item_id);
+cpu_to_le32s(&e->offset);
+cpu_to_le32s(&e->length);
+cpu_to_le32s(&e->data_bits);
+}
diff --git a/block/vhdx.c b/block/vhdx.c
index 8cba312..2944093 100644
--- a/block/vhdx.c
+++ b/block/vhdx.c
@@ -472,10 +472,7 @@ static int vhdx_open_region_tables(BlockDriverState *bs, BDRVVHDXState *s)
 goto fail;
 }
 memcpy(&s->rt, buffer, sizeof(s->rt));
-le32_to_cpus(&s->rt.signature);
-le32_to_cpus(&s->rt.checksum);
-le32_to_cpus(&s->rt.entry_count);
-le32_to_cpus(&s->rt.reserved);
+vhdx_region_header_le_import(&s->rt);
 offset += sizeof(s->rt);
 
 if (!vhdx_checksum_is_valid(buffer, VHDX_HEADER_BLOCK_SIZE, 4) ||
@@ -494,10 +491,7 @@ static int vhdx_open_region_tables(BlockDriverState *bs, BDRVVHDXState *s)
 memcpy(&rt_entry, buffer + offset, sizeof(rt_entry));
 offset += sizeof(rt_entry);
 
-leguid_to_cpus(&rt_entry.guid);
-le64_to_cpus(&rt_entry.file_offset);
-le32_to_cpus(&rt_entry.length);
-le32_to_cpus(&rt_entry.data_bits);
+vhdx_region_entry_le_import(&rt_entry);
 
 /* check for region overlap between these entries, and any
  * other memory regions in the file */
@@ -587,9 +581,7 @@ static int vhdx_parse_metadata(BlockDriverState *bs, BDRVVHDXState *s)
 memcpy(&s->metadata_hdr, buffer, sizeof(s->metadata_hdr));
 offset += sizeof(s->metadata_hdr);
 
-le64_to_cpus(&s->metadata_hdr.signature);
-le16_to_cpus(&s->metadata_hdr.reserved);
-le16_to_cpus(&s->metadata_hdr.entry_count);
+vhdx_metadata_header_le_import(&s->metadata_hdr);
 
 if (memcmp(&s->metadata_hdr.signature, "metadata", 8)) {
 ret = -EINVAL;
@@ -608,11 +600,7 @@ static int vhdx_parse_metadata(BlockDriverState *bs, BDRVVHDXState *s)
 memcpy(&md_entry, buffer + offset, sizeof(md_entry));
 offset += sizeof(md_entry);
 
-leguid_to_cpus(&md_entry.item_id);
-le32_to_cpus(&md_entry.offset);
-le32_to_cpus(&md_entry.length);
-le32_to_cpus(&md_entry.data_bits);
-le32_to_cpus(&md_entry.reserved2);
+vhdx_metadata_entry_le_import(&md_entry);
 
 if (guid_eq(md_entry.item_id, file_param_guid)) {
 if (s->metadata_entries.present & META_FILE_PARAMETER_PRESENT) {
diff --git a/block/vhdx.h b/block/vhdx.h
index 42089d3..d35345d 100644
--- a/block/vhdx.h
+++ b/block/vhdx.h
@@ -419,7 +419,14 @@ void vhdx_log_desc_le_export(VHDXLogDescriptor *d);
 void vhdx_log_data_le_export(VHDXLogDataSector *d);
 void vhdx_log_entry_hdr_le_import(VHDXLogEntryHeader *hdr);
 void vhdx_log_entry_hdr_le_export(VHDXLogEntryHeader *hdr);
-
+void vhdx_region_header_le_import(VHDXRegionTableHeader *hdr);
+void vhdx_region_header_le_export(VHDXRegionTableHeader *hdr);
+void vhdx_region_entry_le_impo

[Qemu-devel] [PATCH v6 03/24] target-arm: Extract the disas struct to a header file

2013-09-03 Thread Peter Maydell

From: Alexander Graf 

We will need to share the disassembly status struct between AArch32 and
AArch64 modes. So put it into a header file that both sides can use.

Signed-off-by: Alexander Graf 
Signed-off-by: John Rigby 
Message-id: 1368505980-17151-2-git-send-email-john.ri...@linaro.org
Signed-off-by: Peter Maydell 
---
 target-arm/translate.c |   24 +---
 target-arm/translate.h |   27 +++
 2 files changed, 28 insertions(+), 23 deletions(-)
 create mode 100644 target-arm/translate.h

diff --git a/target-arm/translate.c b/target-arm/translate.c
index 8e58eb1..1bb6f46 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -46,29 +46,7 @@
 
 #define ARCH(x) do { if (!ENABLE_ARCH_##x) goto illegal_op; } while(0)
 
-/* internal defines */
-typedef struct DisasContext {
-target_ulong pc;
-int is_jmp;
-/* Nonzero if this instruction has been conditionally skipped.  */
-int condjmp;
-/* The label that will be jumped to when the instruction is skipped.  */
-int condlabel;
-/* Thumb-2 conditional execution bits.  */
-int condexec_mask;
-int condexec_cond;
-struct TranslationBlock *tb;
-int singlestep_enabled;
-int thumb;
-int bswap_code;
-#if !defined(CONFIG_USER_ONLY)
-int user;
-#endif
-int vfp_enabled;
-int vec_len;
-int vec_stride;
-} DisasContext;
-
+#include "translate.h"
 static uint32_t gen_opc_condexec_bits[OPC_BUF_SIZE];
 
 #if defined(CONFIG_USER_ONLY)
diff --git a/target-arm/translate.h b/target-arm/translate.h
new file mode 100644
index 000..e727bc6
--- /dev/null
+++ b/target-arm/translate.h
@@ -0,0 +1,27 @@
+#ifndef TARGET_ARM_TRANSLATE_H
+#define TARGET_ARM_TRANSLATE_H
+
+/* internal defines */
+typedef struct DisasContext {
+target_ulong pc;
+int is_jmp;
+/* Nonzero if this instruction has been conditionally skipped.  */
+int condjmp;
+/* The label that will be jumped to when the instruction is skipped.  */
+int condlabel;
+/* Thumb-2 conditional execution bits.  */
+int condexec_mask;
+int condexec_cond;
+struct TranslationBlock *tb;
+int singlestep_enabled;
+int thumb;
+int bswap_code;
+#if !defined(CONFIG_USER_ONLY)
+int user;
+#endif
+int vfp_enabled;
+int vec_len;
+int vec_stride;
+} DisasContext;
+
+#endif /* TARGET_ARM_TRANSLATE_H */
-- 
1.7.9.5

Re: [Qemu-devel] [PATCH] seccomp: adding a second whitelist

2013-09-03 Thread Corey Bryant

On 09/03/2013 04:05 PM, Eduardo Otubo wrote:
> On 09/03/2013 03:02 PM, Corey Bryant wrote:
>> On 08/30/2013 10:21 AM, Eduardo Otubo wrote:
>>> On 08/29/2013 05:34 AM, Stefan Hajnoczi wrote:
>>>> On Wed, Aug 28, 2013 at 10:04:32PM -0300, Eduardo Otubo wrote:
>>>>> Now there's a second whitelist, right before the vcpu starts. The
>>>>> second whitelist is the same as the first one, except for exec() and
>>>>> select().
>>>>
>>>> -netdev tap,downscript=/path/to/script requires exec() in the QEMU
>>>> shutdown code path.  Will this work with seccomp?
>>>
>>> I actually don't know, but I'll test that as well. Can you run a test
>>> with this patch and -netdev? I mean, if you're pointing that out you
>>> might have a scenario already set up, right?
>>>
>>> Thanks!
>>
>> This uses exec() in net/tap.c.
>>
>> I think if we're going to introduce a sandbox environment that restricts
>> existing QEMU behavior, then we have to introduce a new argument to the
>> -sandbox option.  So for example, "-sandbox on" would continue to use
>> the whitelist that allows everything in QEMU to work (or at least it
>> should :).  And something like "-sandbox on,strict=on" would use the
>> whitelist + blacklist.
>
> I think this is very reasonable. I'll work on implementing this
> option for v2.
>
>> If this is acceptable though, then I wonder how we could go about adding
>> new syscalls to the blacklist in future QEMU releases without regressing
>> "-sandbox on,strict=on".
>>
>> By the way, are any test buckets running regularly with -sandbox on?
>
> I am running tests with virt-test. Need to come up with a script that
> checks for unused syscalls, though.

Would it be possible to submit a patch to turn on -sandbox for some/all 
QEMU virt-test tests?  That would enable regular regression runs that 
aren't dependent on you.  Plus it would be a good proof point of the 
QEMU seccomp support if the tests run successfully.


--
Regards,
Corey Bryant

[Qemu-devel] [PATCH v5 00/16] VHDX log replay and write support, .bdrv_create()

2013-09-03 Thread Jeff Cody

This patch series contains the initial VHDX log parsing, replay,
and write support.  (New with v4: VHDX image file creation)

=== v5 changes ===

v5 is also available for testing from:
https://github.com/codyprime/qemu-kvm-jtc/tree/vhdx-write-v5-upstream

Most of the patches from v4 -> v5 are the same, but there are a few differences
and a few new patches.  Here is a summary of which patches are different and/or
new:

$ ~/work/github/git-scripts/git-series-diff -u vhdx-write-v4-upstream -r qemu/master..HEAD
Key:
[] : patches are identical
[] : number of functional differences between patches in -u and -r series
[ new] : patch is new in the range given by -r
The flags [FC] indicate (F)unctional and (C)ontextual differences, respectively

001/16:[] [--] 'block: vhdx - minor comments and typo correction.'
002/16:[] [--] 'block: vhdx - add header update capability.'
003/16:[] [--] 'block: vhdx code movement - VHDXMetadataEntries and BDRVVHDXState to header.'
004/16:[] [--] 'block: vhdx - log support struct and defines'
005/16:[] [--] 'block: vhdx - break endian translation functions out'
006/16:[] [--] 'block: vhdx - update log guid in header, and first write tracker'
007/16:[ new] 'block: vhdx code movement - move vhdx_close() above vhdx_open()'
008/16:[0070] [FC] 'block: vhdx - log parsing, replay, and flush support'
009/16:[ new] 'block: vhdx - add region overlap detection for image files'
010/16:[0003] [FC] 'block: vhdx - add log write support'
011/16:[] [--] 'block: vhdx write support'
012/16:[ new] 'block: vhdx - remove BAT file offset bit shifting'
013/16:[] [-C] 'block: vhdx - move more endian translations to vhdx-endian.c'
014/16:[] [-C] 'block: vhdx - break out code operations to functions'
015/16:[] [--] 'block: vhdx - fix comment typos in header, fix incorrect struct fields'
016/16:[] [-C] 'block: vhdx - add .bdrv_create() support'


Patch highlights:

Patch 7  just some minor code movement, in prep for changes in patch 8

Patch 8  incorporates review feedback from Stefan, for the previous Patch 7
 in v4.

Patch 9  adds region checking for log, region table, and metadata tables, per
 suggestion from Stefan.

Patch 10 minor change from changes made in 8/16 (vhdx_guid_is_zero() is gone)

Patch 12 is just some minor housekeeping, to get rid of bit shifting that
 doesn't need to happen.
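
The kind of check described for Patch 9 can be sketched as a plain interval test. This is an illustrative sketch only, not the code from the series: ExampleRegion, regions_overlap() and find_overlap() are hypothetical names, regions are treated as half-open byte ranges [start, start + length), lengths are assumed non-zero, and start + length is assumed not to overflow 64 bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one area of the image file. */
typedef struct ExampleRegion {
    uint64_t start;    /* file offset of the first byte */
    uint64_t length;   /* size in bytes, assumed non-zero */
} ExampleRegion;

/* Two half-open ranges overlap when each one starts before the other ends. */
static bool regions_overlap(const ExampleRegion *a, const ExampleRegion *b)
{
    uint64_t a_end = a->start + a->length;
    uint64_t b_end = b->start + b->length;

    assert(a->length != 0 && b->length != 0);
    return a->start < b_end && b->start < a_end;
}

/* Return the index of the first already-registered region that overlaps
 * `candidate`, or -1 if the candidate is free of overlaps. */
static int find_overlap(const ExampleRegion *regs, size_t n,
                        const ExampleRegion *candidate)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (regions_overlap(&regs[i], candidate)) {
            return (int)i;
        }
    }
    return -1;
}
```

A caller would register the header area, log, region table and metadata table as they are parsed, and reject the image as soon as find_overlap() reports a hit.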



=== v4 changes ===  

v4 patches are available from github as well, on branch vhdx-write-v4-upstream:
https://github.com/codyprime/qemu-kvm-jtc/tree/vhdx-write-v4-upstream
https://github.com/codyprime/qemu-kvm-jtc.git

Those in the midst of reviewing v3, don't fear - the only change in v4 is
the addition of patches on the end of the series (patches 10-13).  These
patches enable creating VHDX images.  Image files created have been
(briefly & lightly) tested on Hyper-V running on Windows Server 2012.

Some of the new patches could be squashed with earlier patches in the series,
but I refrained from doing so, since some of the patches have already been
reviewed, and others are in the midst of review.  I want to make it as easy
as possible on those currently reviewing. There is nothing critical
that needs to be pushed into the earlier patches.

New patches:

Patch 10:  Breaks out some more endian translation functions
(likely squashable into patch 5)

Patch 11:  Break out some operations into separate helper functions

Patch 12:  More comment typos and header fixes in vhdx.h
(likely squashable into patch 1)

Patch 13:  Adds .bdrv_create() for vhdx.  VHDX images can be created as
   Fixed or Dynamic images.

Patches 1-9 are unchanged.

=== end v4 changelog ===

=== v3 changes ===  

Thank you Kevin & Stefan for the feedback; incorporated in v3:

Patch 1: --- nil ---

Patch 2: * use sizeof(crc) instead of 4
 * remove outdated comment
 * use sizeof(MSGUID) instead of 16
 * direct assignment of guid structs rather than memcpy
 * rename 'rw' to 'generate_data_write_guid'
 * use offsetof() instead of 4
 * comment typos
 * add missing error checking
 * MSGUID is now QEMU_PACKED
 * configure enable for vhdx is now correct and not braindead

Patch 3: --- nil ---

Patch 4: * code style fixes
 * removed unused struct (VHDXLogEntryInfo)

Patch 5: * more direct assignment of guid rather than memcpy
 * order of operation in export/import the same now
 * became less generous with newlines (bah-humbug!)

Patch 6: * more direct assignment of guid rather than memcpy 
 * add error check in vhdx_user_visible_write(), now returns int

Patch 7: * check error return now of vhdx_user_visible_write()

Patch 8: * check error return now of vhdx_user_visible_write()
 * vhdx_log_write_sectors() uses bdrv_pwrite() vs bdrv_pwrite_sync()
 * more direct assignment of guid rather than memcpy 
 * use off

[Qemu-devel] [PATCH v5 11/16] block: vhdx write support

2013-09-03 Thread Jeff Cody

This adds support for writing to VHDX image files, using coroutines.
Writes into the BAT table go through the VHDX log.  Currently, BAT
table writes occur when expanding a dynamic VHDX file, and allocating a
new BAT entry.
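
The two pieces of arithmetic this patch relies on, 1MB alignment of new payload blocks and packing a BAT entry, can be sketched on their own. This is a sketch under assumptions, not the driver code: the EX_* constants and helper names are hypothetical. The shift of 20 assumes the 44-bit offset field mentioned in the patch comment occupies the top of the 64-bit entry, and the state value 6 used in the note below is an assumed stand-in for PAYLOAD_BLOCK_FULL_PRESENT.

```c
#include <assert.h>
#include <stdint.h>

#define EX_MIB                 (1024ULL * 1024ULL)
#define EX_BAT_FILE_OFF_BITS   20       /* assumed position of the offset field */
#define EX_BAT_STATE_BIT_MASK  0x07ULL  /* 3 state bits, per the patch comment */

/* Round a file length up to the next 1MB boundary, where the next payload
 * block would be allocated. */
static uint64_t round_up_1mib(uint64_t off)
{
    return (off + EX_MIB - 1) & ~(EX_MIB - 1);
}

/* Pack a 1MB-aligned file offset and a block state into one BAT entry
 * (host byte order; conversion to little-endian is a separate step). */
static uint64_t bat_entry_pack(uint64_t file_offset, unsigned state)
{
    uint64_t entry;

    assert((file_offset & (EX_MIB - 1)) == 0);
    entry = (file_offset >> 20) << EX_BAT_FILE_OFF_BITS;  /* offset in MB */
    entry |= state & EX_BAT_STATE_BIT_MASK;
    return entry;
}

/* Recover the byte offset stored in an entry. */
static uint64_t bat_entry_file_offset(uint64_t entry)
{
    return (entry >> EX_BAT_FILE_OFF_BITS) << 20;
}

/* Recover the block state stored in an entry. */
static unsigned bat_entry_state(uint64_t entry)
{
    return (unsigned)(entry & EX_BAT_STATE_BIT_MASK);
}
```

For example, bat_entry_pack(5 * EX_MIB, 6) stores the offset as 5 (in MB units) above the state bits, and the two accessor functions return the original byte offset and state.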

Signed-off-by: Jeff Cody 
---
 block/vhdx.c | 150 ++-
 1 file changed, 148 insertions(+), 2 deletions(-)

diff --git a/block/vhdx.c b/block/vhdx.c
index b15d6e5..69a06a9 100644
--- a/block/vhdx.c
+++ b/block/vhdx.c
@@ -940,7 +940,7 @@ static int vhdx_open(BlockDriverState *bs, QDict *options, int flags)
 }
 }
 
-/* TODO: differencing files, write */
+/* TODO: differencing files */
 
 return 0;
 fail:
@@ -1069,7 +1069,43 @@ exit:
 return ret;
 }
 
+/*
+ * Allocate a new payload block at the end of the file.
+ *
+ * Allocation will happen at 1MB alignment inside the file
+ *
+ * Returns the file offset start of the new payload block
+ */
+static int vhdx_allocate_block(BlockDriverState *bs, BDRVVHDXState *s,
+uint64_t *new_offset)
+{
+*new_offset = bdrv_getlength(bs->file);
+
+/* per the spec, the address for a block is in units of 1MB */
+*new_offset = ROUND_UP(*new_offset, 1024*1024);
+
+return bdrv_truncate(bs->file, *new_offset + s->block_size);
+}
+
+/*
+ * Update the BAT table entry with the new file offset, and the new entry
+ * state */
+static void vhdx_update_bat_table_entry(BlockDriverState *bs, BDRVVHDXState *s,
+   VHDXSectorInfo *sinfo,
+   uint64_t *bat_entry,
+   uint64_t *bat_offset, int state)
+{
+/* The BAT entry is a uint64, with 44 bits for the file offset in units of
+ * 1MB, and 3 bits for the block state. */
+s->bat[sinfo->bat_idx]  = ((sinfo->file_offset>>20) <<
+   VHDX_BAT_FILE_OFF_BITS);
 
+s->bat[sinfo->bat_idx] |= state & VHDX_BAT_STATE_BIT_MASK;
+
+*bat_entry = cpu_to_le64(s->bat[sinfo->bat_idx]);
+*bat_offset = s->bat_offset + sinfo->bat_idx * sizeof(VHDXBatEntry);
+
+}
 
 /* Per the spec, on the first write of guest-visible data to the file the
  * data write guid must be updated in the header */
@@ -1086,7 +1122,117 @@ int vhdx_user_visible_write(BlockDriverState *bs, BDRVVHDXState *s)
 static coroutine_fn int vhdx_co_writev(BlockDriverState *bs, int64_t sector_num,
   int nb_sectors, QEMUIOVector *qiov)
 {
-return -ENOTSUP;
+int ret = -ENOTSUP;
+BDRVVHDXState *s = bs->opaque;
+VHDXSectorInfo sinfo;
+uint64_t bytes_done = 0;
+uint64_t bat_entry = 0;
+uint64_t bat_entry_offset = 0;
+bool bat_update;
+QEMUIOVector hd_qiov;
+
+qemu_iovec_init(&hd_qiov, qiov->niov);
+
+qemu_co_mutex_lock(&s->lock);
+
+ret = vhdx_user_visible_write(bs, s);
+if (ret < 0) {
+goto exit;
+}
+
+while (nb_sectors > 0) {
+if (s->params.data_bits & VHDX_PARAMS_HAS_PARENT) {
+/* not supported yet */
+ret = -ENOTSUP;
+goto exit;
+} else {
+bat_update = false;
+vhdx_block_translate(s, sector_num, nb_sectors, &sinfo);
+
+qemu_iovec_reset(&hd_qiov);
+qemu_iovec_concat(&hd_qiov, qiov,  bytes_done, sinfo.bytes_avail);
+/* check the payload block state */
+switch (s->bat[sinfo.bat_idx] & VHDX_BAT_STATE_BIT_MASK) {
+case PAYLOAD_BLOCK_ZERO:
+/* in this case, we need to preserve zero writes for
+ * data that is not part of this write, so we must pad
+ * the rest of the buffer to zeroes */
+
+/* if we are on a posix system with ftruncate() that extends
+ * a file, then it is zero-filled for us.  On Win32, the raw
+ * layer uses SetFilePointer and SetFileEnd, which does not
+ * zero fill AFAIK */
+
+/* TODO: queue another write of zero buffers if the host OS does
+ * not zero-fill on file extension */
+
+/* fall through */
+case PAYLOAD_BLOCK_NOT_PRESENT: /* fall through */
+case PAYLOAD_BLOCK_UNMAPPED:/* fall through */
+case PAYLOAD_BLOCK_UNDEFINED:   /* fall through */
+ret = vhdx_allocate_block(bs, s, &sinfo.file_offset);
+if (ret < 0) {
+goto exit;
+}
+/* once we support differencing files, this may also be
+ * partially present */
+/* update block state to the newly specified state */
+vhdx_update_bat_table_entry(bs, s, &sinfo, &bat_entry,
+&bat_entry_offset,
+PAYLOAD_BLOCK_FULL_PRESENT);
+bat_update = true;
+/* si

[Qemu-devel] [PATCH v5 14/16] block: vhdx - break out code operations to functions

2013-09-03 Thread Jeff Cody

This is preparation for vhdx_create().  The ability to write headers,
and calculate the number of BAT entries will be needed within the
create() functions, so move this relevant code into helper functions.
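
The "31 - clz32(x)" expression that this patch moves into vhdx_set_shift_bits() is a base-2 logarithm for power-of-two sizes. The following standalone sketch shows the idea: example_clz32() is a hypothetical loop-based replacement for QEMU's clz32(), and example_shift_bits() is a made-up wrapper that asserts the power-of-two precondition.

```c
#include <assert.h>
#include <stdint.h>

/* Count leading zero bits of a 32-bit value; returns 32 for 0. */
static int example_clz32(uint32_t v)
{
    int n = 0;

    if (v == 0) {
        return 32;
    }
    while (!(v & 0x80000000u)) {
        v <<= 1;
        n++;
    }
    return n;
}

/* For a power-of-two size, return the shift count that divides by it. */
static int example_shift_bits(uint32_t pow2_size)
{
    assert(pow2_size != 0 && (pow2_size & (pow2_size - 1)) == 0);
    return 31 - example_clz32(pow2_size);
}
```

Once the shift counts are cached, a division such as offset / logical_sector_size becomes offset >> logical_sector_size_bits, which is why the values need to be computed both when an image is opened and when one is created.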

Signed-off-by: Jeff Cody 
---
 block/vhdx.c | 121 +++
 1 file changed, 80 insertions(+), 41 deletions(-)

diff --git a/block/vhdx.c b/block/vhdx.c
index 2944093..94fa84f 100644
--- a/block/vhdx.c
+++ b/block/vhdx.c
@@ -248,6 +248,14 @@ static void vhdx_region_unregister_all(BDRVVHDXState *s)
 }
 }
 
+static void vhdx_set_shift_bits(BDRVVHDXState *s)
+{
+s->logical_sector_size_bits = 31 - clz32(s->logical_sector_size);
+s->sectors_per_block_bits =   31 - clz32(s->sectors_per_block);
+s->chunk_ratio_bits = 63 - clz64(s->chunk_ratio);
+s->block_size_bits =  31 - clz32(s->block_size);
+}
+
 /*
  * Per the MS VHDX Specification, for every VHDX file:
  *  - The header section is fixed size - 1 MB
@@ -267,6 +275,50 @@ static int vhdx_probe(const uint8_t *buf, int buf_size, const char *filename)
 return 0;
 }
 
+/*
+ * Writes the header to the specified offset.
+ *
+ * This will optionally read in buffer data from disk (otherwise zero-fill),
+ * and then update the header checksum.  Header is converted to proper
+ * endianness before being written to the specified file offset
+ */
+static int vhdx_write_header(BlockDriverState *bs_file, VHDXHeader *hdr,
+ uint64_t offset, bool read)
+{
+uint8_t *buffer = NULL;
+int ret;
+VHDXHeader header_le;
+
+assert(bs_file != NULL);
+assert(hdr != NULL);
+
+/* the header checksum is not over just the packed size of VHDXHeader,
+ * but rather over the entire 'reserved' range for the header, which is
+ * 4KB (VHDX_HEADER_SIZE). */
+
+buffer = qemu_blockalign(bs_file, VHDX_HEADER_SIZE);
+if (read) {
+/* if true, we can't assume the extra reserved bytes are 0 */
+ret = bdrv_pread(bs_file, offset, buffer, VHDX_HEADER_SIZE);
+if (ret < 0) {
+goto exit;
+}
+} else {
+memset(buffer, 0, VHDX_HEADER_SIZE);
+}
+
+/* overwrite the actual VHDXHeader portion */
+memcpy(buffer, hdr, sizeof(VHDXHeader));
+hdr->checksum = vhdx_update_checksum(buffer, VHDX_HEADER_SIZE,
+ offsetof(VHDXHeader, checksum));
+vhdx_header_le_export(hdr, &header_le);
+ret = bdrv_pwrite_sync(bs_file, offset, &header_le, sizeof(VHDXHeader));
+
+exit:
+qemu_vfree(buffer);
+return ret;
+}
+
 /* Update the VHDX headers
  *
  * This follows the VHDX spec procedures for header updates.
@@ -282,8 +334,6 @@ static int vhdx_update_header(BlockDriverState *bs, BDRVVHDXState *s,
 
 VHDXHeader *active_header;
 VHDXHeader *inactive_header;
-VHDXHeader header_le;
-uint8_t *buffer;
 
 /* operate on the non-current header */
 if (s->curr_header == 0) {
@@ -311,31 +361,13 @@ static int vhdx_update_header(BlockDriverState *bs, BDRVVHDXState *s,
 inactive_header->log_guid = *log_guid;
 }
 
-/* the header checksum is not over just the packed size of VHDXHeader,
- * but rather over the entire 'reserved' range for the header, which is
- * 4KB (VHDX_HEADER_SIZE). */
-
-buffer = qemu_blockalign(bs, VHDX_HEADER_SIZE);
-/* we can't assume the extra reserved bytes are 0 */
-ret = bdrv_pread(bs->file, header_offset, buffer, VHDX_HEADER_SIZE);
-if (ret < 0) {
-goto exit;
-}
-/* overwrite the actual VHDXHeader portion */
-memcpy(buffer, inactive_header, sizeof(VHDXHeader));
-inactive_header->checksum =
-vhdx_update_checksum(buffer, VHDX_HEADER_SIZE,
- offsetof(VHDXHeader, checksum));
-vhdx_header_le_export(inactive_header, &header_le);
-ret = bdrv_pwrite_sync(bs->file, header_offset, &header_le,
-   sizeof(VHDXHeader));
+ret = vhdx_write_header(bs->file, inactive_header, header_offset, true);
 if (ret < 0) {
 goto exit;
 }
 s->curr_header = hdr_idx;
 
 exit:
-qemu_vfree(buffer);
 return ret;
 }
 
@@ -773,10 +805,7 @@ static int vhdx_parse_metadata(BlockDriverState *bs, BDRVVHDXState *s)
 goto exit;
 }
 
-s->logical_sector_size_bits = 31 - clz32(s->logical_sector_size);
-s->sectors_per_block_bits =   31 - clz32(s->sectors_per_block);
-s->chunk_ratio_bits = 63 - clz64(s->chunk_ratio);
-s->block_size_bits =  31 - clz32(s->block_size);
+vhdx_set_shift_bits(s);
 
 ret = 0;
 
@@ -785,6 +814,31 @@ exit:
 return ret;
 }
 
+/*
+ * Calculate the number of BAT entries, including sector
+ * bitmap entries.
+ */
+static void vhdx_calc_bat_entries(BDRVVHDXState *s)
+{
+uint32_t data_blocks_cnt, bitmap_blocks_cnt;
+
+data_blocks_cnt = s->virtual_disk_size >> s->block_size_bits;
+if (s->vir

[Qemu-devel] [PATCH v5 06/16] block: vhdx - update log guid in header, and first write tracker

2013-09-03 Thread Jeff Cody

Allow tracking of first file write in the VHDX image, as well as
the ability to update the GUID in the header.  This is in preparation
for log support.

Signed-off-by: Jeff Cody 
---
 block/vhdx.c | 30 --
 block/vhdx.h |  6 ++
 2 files changed, 30 insertions(+), 6 deletions(-)

diff --git a/block/vhdx.c b/block/vhdx.c
index 15a4d1d..4dc056b 100644
--- a/block/vhdx.c
+++ b/block/vhdx.c
@@ -229,7 +229,7 @@ static int vhdx_probe(const uint8_t *buf, int buf_size, const char *filename)
  *  - non-current header is updated with largest sequence number
  */
 static int vhdx_update_header(BlockDriverState *bs, BDRVVHDXState *s,
-  bool generate_data_write_guid)
+  bool generate_data_write_guid, MSGUID *log_guid)
 {
 int ret = 0;
 int hdr_idx = 0;
@@ -261,6 +261,11 @@ static int vhdx_update_header(BlockDriverState *bs, BDRVVHDXState *s,
 vhdx_guid_generate(&inactive_header->data_write_guid);
 }
 
+/* update the log guid if present */
+if (log_guid) {
+inactive_header->log_guid = *log_guid;
+}
+
 /* the header checksum is not over just the packed size of VHDXHeader,
  * but rather over the entire 'reserved' range for the header, which is
  * 4KB (VHDX_HEADER_SIZE). */
@@ -293,16 +298,16 @@ exit:
  * The VHDX spec calls for header updates to be performed twice, so that both
  * the current and non-current header have valid info
  */
-static int vhdx_update_headers(BlockDriverState *bs, BDRVVHDXState *s,
-   bool generate_data_write_guid)
+int vhdx_update_headers(BlockDriverState *bs, BDRVVHDXState *s,
+bool generate_data_write_guid, MSGUID *log_guid)
 {
 int ret;
 
-ret = vhdx_update_header(bs, s, generate_data_write_guid);
+ret = vhdx_update_header(bs, s, generate_data_write_guid, log_guid);
 if (ret < 0) {
 return ret;
 }
-ret = vhdx_update_header(bs, s, generate_data_write_guid);
+ret = vhdx_update_header(bs, s, generate_data_write_guid, log_guid);
 return ret;
 }
 
@@ -782,6 +787,7 @@ static int vhdx_open(BlockDriverState *bs, QDict *options, int flags)
 
 
 s->bat = NULL;
+s->first_visible_write = true;
 
 qemu_co_mutex_init(&s->lock);
 
@@ -862,7 +868,7 @@ static int vhdx_open(BlockDriverState *bs, QDict *options, int flags)
 }
 
 if (flags & BDRV_O_RDWR) {
-ret = vhdx_update_headers(bs, s, false);
+ret = vhdx_update_headers(bs, s, false, NULL);
 if (ret < 0) {
 goto fail;
 }
@@ -1002,6 +1008,18 @@ exit:
 
 
 
+/* Per the spec, on the first write of guest-visible data to the file the
+ * data write guid must be updated in the header */
+int vhdx_user_visible_write(BlockDriverState *bs, BDRVVHDXState *s)
+{
+int ret = 0;
+if (s->first_visible_write) {
+s->first_visible_write = false;
+ret = vhdx_update_headers(bs, s, true, NULL);
+}
+return ret;
+}
+
 static coroutine_fn int vhdx_co_writev(BlockDriverState *bs, int64_t sector_num,
   int nb_sectors, QEMUIOVector *qiov)
 {
diff --git a/block/vhdx.h b/block/vhdx.h
index 89d9a78..8c61bfd 100644
--- a/block/vhdx.h
+++ b/block/vhdx.h
@@ -361,6 +361,7 @@ typedef struct BDRVVHDXState {
 VHDXBatEntry *bat;
 uint64_t bat_offset;
 
+bool first_visible_write;
 MSGUID session_guid;
 
 VHDXLogEntries log;
@@ -372,6 +373,9 @@ typedef struct BDRVVHDXState {
 
 void vhdx_guid_generate(MSGUID *guid);
 
+int vhdx_update_headers(BlockDriverState *bs, BDRVVHDXState *s, bool rw,
+MSGUID *log_guid);
+
 uint32_t vhdx_update_checksum(uint8_t *buf, size_t size, int crc_offset);
 uint32_t vhdx_checksum_calc(uint32_t crc, uint8_t *buf, size_t size,
 int crc_offset);
@@ -401,4 +405,6 @@ void vhdx_log_data_le_export(VHDXLogDataSector *d);
 void vhdx_log_entry_hdr_le_import(VHDXLogEntryHeader *hdr);
 void vhdx_log_entry_hdr_le_export(VHDXLogEntryHeader *hdr);
 
+int vhdx_user_visible_write(BlockDriverState *bs, BDRVVHDXState *s);
+
 #endif
-- 
1.8.1.4

[Qemu-devel] [PATCH v5 12/16] block: vhdx - remove BAT file offset bit shifting

2013-09-03 Thread Jeff Cody

Bit shifting can be fun, but in this case it was unnecessary.  The
upper 44 bits of the 64-bit BAT entry specify the File Offset,
so we shifted the bits to get access to the value.

However, per the spec the value is in MB.  So we dutifully shifted back
to the left by 20 bits, to convert to a true uint64_t file offset.

This replaces those steps with just a bit mask, to get rid of the lower
20 bits instead.
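
As a rough illustration of why the two approaches agree, here is a minimal
sketch.  The constant and function names below are hypothetical stand-ins
chosen for this note, not the driver's own definitions; the only thing taken
from the patch is the layout (upper 44 bits hold the offset in 1MB units).

```c
#include <stdint.h>

/* Assumed layout, per the commit message: the upper 44 bits of a 64-bit
 * BAT entry hold the file offset in 1MB units.  Names are hypothetical. */
#define BAT_FILE_OFF_BITS 20                      /* 64 - 44 */
#define BAT_FILE_OFF_MASK 0xFFFFFFFFFFF00000ULL   /* upper 44 bits */

/* Old approach: shift down to get the value in MB, then shift back up
 * by 20 bits to turn MB into a byte offset. */
static uint64_t bat_offset_by_shift(uint64_t bat_entry)
{
    uint64_t mb = bat_entry >> BAT_FILE_OFF_BITS;
    return mb << 20;
}

/* New approach: clear the low 20 bits in one step. */
static uint64_t bat_offset_by_mask(uint64_t bat_entry)
{
    return bat_entry & BAT_FILE_OFF_MASK;
}
```

Both helpers drop the state and reserved bits in the low 20 bits, so for any
entry they should return the same byte offset.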

Signed-off-by: Jeff Cody 
---
 block/vhdx.c | 6 ++
 block/vhdx.h | 1 -
 2 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/block/vhdx.c b/block/vhdx.c
index 69a06a9..8cba312 100644
--- a/block/vhdx.c
+++ b/block/vhdx.c
@@ -984,7 +984,7 @@ static void vhdx_block_translate(BDRVVHDXState *s, int64_t sector_num,
 
 sinfo->bytes_avail = sinfo->sectors_avail << s->logical_sector_size_bits;
 
-sinfo->file_offset = s->bat[sinfo->bat_idx] >> VHDX_BAT_FILE_OFF_BITS;
+sinfo->file_offset = s->bat[sinfo->bat_idx] & VHDX_BAT_FILE_OFF_MASK;
 
 sinfo->block_offset = block_offset << s->logical_sector_size_bits;
 
@@ -998,7 +998,6 @@ static void vhdx_block_translate(BDRVVHDXState *s, int64_t sector_num,
  * in the block, and add in the payload data block offset
  * in the file, in bytes, to get the final read address */
 
-sinfo->file_offset <<= 20;  /* now in bytes, rather than 1MB units */
 sinfo->file_offset += sinfo->block_offset;
 }
 
@@ -1097,8 +1096,7 @@ static void vhdx_update_bat_table_entry(BlockDriverState *bs, BDRVVHDXState *s,
 {
 /* The BAT entry is a uint64, with 44 bits for the file offset in units of
  * 1MB, and 3 bits for the block state. */
-s->bat[sinfo->bat_idx]  = ((sinfo->file_offset>>20) <<
-   VHDX_BAT_FILE_OFF_BITS);
+s->bat[sinfo->bat_idx]  = sinfo->file_offset;
 
 s->bat[sinfo->bat_idx] |= state & VHDX_BAT_STATE_BIT_MASK;
 
diff --git a/block/vhdx.h b/block/vhdx.h
index 4f32c01..42089d3 100644
--- a/block/vhdx.h
+++ b/block/vhdx.h
@@ -229,7 +229,6 @@ typedef struct QEMU_PACKED VHDXLogDataSector {
 /* upper 44 bits are the file offset in 1MB units lower 3 bits are the state
other bits are reserved */
 #define VHDX_BAT_STATE_BIT_MASK 0x07
-#define VHDX_BAT_FILE_OFF_BITS (64 - 44)
 #define VHDX_BAT_FILE_OFF_MASK  0xFFFFFFFFFFF00000 /* upper 44 bits */
 typedef uint64_t VHDXBatEntry;
 
-- 
1.8.1.4

[Qemu-devel] [PATCH v5 09/16] block: vhdx - add region overlap detection for image files

2013-09-03 Thread Jeff Cody

Regions in the image file cannot overlap - the log, region tables,
and metadata must all be unique and non-overlapping.

This adds region checking by means of a QLIST; there can be a variable
number of regions and metadata (there may be metadata or region tables
that we do not recognize / know about, but are not required).

This adds the capability to register a region for later checking, and
to check against registered regions for any overlap.

Also, if either the BAT or Metadata region table is not found, return
an error.
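
For readers who want the register-then-check idea in isolation, the sketch
below uses a plain array rather than a QLIST.  The Region type and both
function names are hypothetical, and the comparison shown is the general
half-open interval test, which is not necessarily the exact condition used
in the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a registered region: [start, end) in bytes. */
typedef struct Region {
    uint64_t start;
    uint64_t end;
} Region;

/* Two half-open ranges overlap unless one ends at or before the point
 * where the other begins. */
static bool ranges_overlap(uint64_t a_start, uint64_t a_end,
                           uint64_t b_start, uint64_t b_end)
{
    return a_start < b_end && b_start < a_end;
}

/* Returns 0 if [start, start + length) is clear of every registered
 * region, and -1 if it overlaps any of them. */
static int region_check(const Region *regs, size_t n,
                        uint64_t start, uint64_t length)
{
    uint64_t end = start + length;

    for (size_t i = 0; i < n; i++) {
        if (ranges_overlap(start, end, regs[i].start, regs[i].end)) {
            return -1;
        }
    }
    return 0;
}
```

Written this way, the test also reports a candidate range that fully
contains an already registered region, a case worth keeping in mind when
reviewing any overlap check.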

Signed-off-by: Jeff Cody 
---
 block/vhdx.c | 83 
 block/vhdx.h |  8 ++
 2 files changed, 91 insertions(+)

diff --git a/block/vhdx.c b/block/vhdx.c
index d0499ba..b15d6e5 100644
--- a/block/vhdx.c
+++ b/block/vhdx.c
@@ -203,6 +203,51 @@ void vhdx_guid_generate(MSGUID *guid)
 memcpy(guid, uuid, sizeof(MSGUID));
 }
 
+/* Check for region overlaps inside the VHDX image */
+static int vhdx_region_check(BDRVVHDXState *s, uint64_t start, uint64_t length)
+{
+int ret = 0;
+uint64_t end;
+VHDXRegionEntry *r;
+
+end = start + length;
+QLIST_FOREACH(r, &s->regions, entries) {
+if ((start >= r->start && start <  r->end) ||
+(end   >  r->start && end   <= r->end)) {
+ret = -EINVAL;
+goto exit;
+}
+}
+
+exit:
+return ret;
+}
+
+/* Register a region for future checks */
+static void vhdx_region_register(BDRVVHDXState *s,
+ uint64_t start, uint64_t length)
+{
+VHDXRegionEntry *r;
+
+r = g_malloc0(sizeof(*r));
+
+r->start = start;
+r->end = start + length;
+
+QLIST_INSERT_HEAD(&s->regions, r, entries);
+}
+
+/* Free all registered regions */
+static void vhdx_region_unregister_all(BDRVVHDXState *s)
+{
+VHDXRegionEntry *r, *r_next;
+
+QLIST_FOREACH_SAFE(r, &s->regions, entries, r_next) {
+QLIST_REMOVE(r, entries);
+g_free(r);
+}
+}
+
 /*
  * Per the MS VHDX Specification, for every VHDX file:
  *  - The header section is fixed size - 1 MB
@@ -388,6 +433,9 @@ static int vhdx_parse_header(BlockDriverState *bs, BDRVVHDXState *s)
 }
 }
 
+vhdx_region_register(s, s->headers[s->curr_header]->log_offset,
+s->headers[s->curr_header]->log_length);
+
 ret = 0;
 
 goto exit;
@@ -451,6 +499,15 @@ static int vhdx_open_region_tables(BlockDriverState *bs, BDRVVHDXState *s)
 le32_to_cpus(&rt_entry.length);
 le32_to_cpus(&rt_entry.data_bits);
 
+/* check for region overlap between these entries, and any
+ * other memory regions in the file */
+ret = vhdx_region_check(s, rt_entry.file_offset, rt_entry.length);
+if (ret < 0) {
+goto fail;
+}
+
+vhdx_region_register(s, rt_entry.file_offset, rt_entry.length);
+
 /* see if we recognize the entry */
 if (guid_eq(rt_entry.guid, bat_guid)) {
 /* must be unique; if we have already found it this is invalid */
@@ -481,6 +538,12 @@ static int vhdx_open_region_tables(BlockDriverState *bs, BDRVVHDXState *s)
 goto fail;
 }
 }
+
+if (!bat_rt_found || !metadata_rt_found) {
+ret = -EINVAL;
+goto fail;
+}
+
 ret = 0;
 
 fail:
@@ -743,6 +806,7 @@ static void vhdx_close(BlockDriverState *bs)
 qemu_vfree(s->bat);
 qemu_vfree(s->parent_entries);
 qemu_vfree(s->log.hdr);
+vhdx_region_unregister_all(s);
 }
 
 static int vhdx_open(BlockDriverState *bs, QDict *options, int flags)
@@ -760,6 +824,7 @@ static int vhdx_open(BlockDriverState *bs, QDict *options, int flags)
 s->log.write = s->log.read = 0;
 
 qemu_co_mutex_init(&s->lock);
+QLIST_INIT(&s->regions);
 
 /* validate the file signature */
 ret = bdrv_pread(bs->file, 0, &signature, sizeof(uint64_t));
@@ -846,8 +911,26 @@ static int vhdx_open(BlockDriverState *bs, QDict *options, int flags)
 goto fail;
 }
 
+uint64_t payblocks = s->chunk_ratio;
+/* endian convert, and verify populated BAT field file offsets against
+ * region table and log entries */
 for (i = 0; i < s->bat_entries; i++) {
 le64_to_cpus(&s->bat[i]);
+if (payblocks--) {
+/* payload bat entries */
+if ((s->bat[i] & VHDX_BAT_STATE_BIT_MASK) ==
+PAYLOAD_BLOCK_FULL_PRESENT) {
+ret = vhdx_region_check(s, s->bat[i] & VHDX_BAT_FILE_OFF_MASK,
+s->block_size);
+if (ret < 0) {
+goto fail;
+}
+}
+} else {
+payblocks = s->chunk_ratio;
+/* Once differencing files are supported, verify sector bitmap
+ * blocks here */
+}
 }
 
 if (flags & BDRV_O_RDWR) {
diff --git a/block/vhdx.h b/block/vhdx.h
index dfb3ed9..831aa13 100644
--- a/block/vhdx.h
+++ b/block/vhdx.

[Qemu-devel] [PATCH v5 04/16] block: vhdx - log support struct and defines

2013-09-03 Thread Jeff Cody

This adds some magic number defines, and internal structure definitions
for VHDX log replay support.  The struct VHDXLogEntries does not reflect
an on-disk data structure, and thus does not need to be packed.
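
The signature defines are the ASCII tags read as little-endian 32-bit
values.  A small sketch of that relationship follows; sig_le32 is a
hypothetical helper written for this note and is not part of the patch.

```c
#include <stdint.h>

/* Interpret the first four bytes of s as a little-endian 32-bit value,
 * i.e. the first character ends up in the least significant byte. */
static uint32_t sig_le32(const char *s)
{
    return  (uint32_t)(unsigned char)s[0]        |
           ((uint32_t)(unsigned char)s[1] << 8)  |
           ((uint32_t)(unsigned char)s[2] << 16) |
           ((uint32_t)(unsigned char)s[3] << 24);
}
```

With this helper, "loge" maps to 0x65676f6c, "desc" to 0x63736564, "zero"
to 0x6f72657a and "data" to 0x61746164, matching the values added to the
header.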

Some minor code style fixes are applied as well.

Signed-off-by: Jeff Cody 
---
 block/vhdx.h | 46 ++
 1 file changed, 30 insertions(+), 16 deletions(-)

diff --git a/block/vhdx.h b/block/vhdx.h
index 74b2d5d..0ab8bf3 100644
--- a/block/vhdx.h
+++ b/block/vhdx.h
@@ -30,12 +30,12 @@
  * 
0.64KB...128KB192KB..256KB1MB
  */
 
-#define VHDX_HEADER_BLOCK_SIZE  (64*1024)
+#define VHDX_HEADER_BLOCK_SIZE  (64 * 1024)
 
 #define VHDX_FILE_ID_OFFSET 0
-#define VHDX_HEADER1_OFFSET (VHDX_HEADER_BLOCK_SIZE*1)
-#define VHDX_HEADER2_OFFSET (VHDX_HEADER_BLOCK_SIZE*2)
-#define VHDX_REGION_TABLE_OFFSET(VHDX_HEADER_BLOCK_SIZE*3)
+#define VHDX_HEADER1_OFFSET (VHDX_HEADER_BLOCK_SIZE * 1)
+#define VHDX_HEADER2_OFFSET (VHDX_HEADER_BLOCK_SIZE * 2)
+#define VHDX_REGION_TABLE_OFFSET(VHDX_HEADER_BLOCK_SIZE * 3)
 
 
 /*
@@ -77,10 +77,10 @@ typedef struct QEMU_PACKED MSGUID {
 #define guid_eq(a, b) \
 (memcmp(&(a), &(b), sizeof(MSGUID)) == 0)
 
-#define VHDX_HEADER_SIZE (4*1024)   /* although the vhdx_header struct in disk
-   is only 582 bytes, for purposes of crc
-   the header is the first 4KB of the 64KB
-   block */
+#define VHDX_HEADER_SIZE (4 * 1024)   /* although the vhdx_header struct in disk
+ is only 582 bytes, for purposes of crc
+ the header is the first 4KB of the 64KB
+ block */
 
 /* The full header is 4KB, although the actual header data is much smaller.
  * But for the checksum calculation, it is over the entire 4KB structure,
@@ -92,7 +92,7 @@ typedef struct QEMU_PACKED VHDXHeader {
VHDX file has 2 of these headers,
and only the header with the highest
sequence number is valid */
-MSGUID  file_write_guid;   /* 128 bit unique identifier. Must be
+MSGUID  file_write_guid;/* 128 bit unique identifier. Must be
updated to new, unique value before
the first modification is made to
file */
@@ -151,7 +151,10 @@ typedef struct QEMU_PACKED VHDXRegionTableEntry {
 
 
 /*  LOG ENTRY STRUCTURES  */
+#define VHDX_LOG_MIN_SIZE (1024 * 1024)
+#define VHDX_LOG_SECTOR_SIZE 4096
 #define VHDX_LOG_HDR_SIZE 64
+#define VHDX_LOG_SIGNATURE 0x65676f6c
 typedef struct QEMU_PACKED VHDXLogEntryHeader {
 uint32_tsignature;  /* "loge" in ASCII */
 uint32_tchecksum;   /* CRC-32C hash of the 64KB table */
@@ -174,7 +177,8 @@ typedef struct QEMU_PACKED VHDXLogEntryHeader {
 } VHDXLogEntryHeader;
 
 #define VHDX_LOG_DESC_SIZE 32
-
+#define VHDX_LOG_DESC_SIGNATURE 0x63736564
+#define VHDX_LOG_ZERO_SIGNATURE 0x6f72657a
 typedef struct QEMU_PACKED VHDXLogDescriptor {
 uint32_tsignature;  /* "zero" or "desc" in ASCII */
 union  {
@@ -194,6 +198,7 @@ typedef struct QEMU_PACKED VHDXLogDescriptor {
vhdx_log_entry_header */
 } VHDXLogDescriptor;
 
+#define VHDX_LOG_DATA_SIGNATURE 0x61746164
 typedef struct QEMU_PACKED VHDXLogDataSector {
 uint32_tdata_signature; /* "data" in ASCII */
 uint32_tsequence_high;  /* 4 MSB of 8 byte sequence_number */
@@ -219,12 +224,12 @@ typedef struct QEMU_PACKED VHDXLogDataSector {
 #define SB_BLOCK_PRESENT6
 
 /* per the spec */
-#define VHDX_MAX_SECTORS_PER_BLOCK  (1<<23)
+#define VHDX_MAX_SECTORS_PER_BLOCK  (1 << 23)
 
 /* upper 44 bits are the file offset in 1MB units lower 3 bits are the state
other bits are reserved */
 #define VHDX_BAT_STATE_BIT_MASK 0x07
-#define VHDX_BAT_FILE_OFF_BITS (64-44)
+#define VHDX_BAT_FILE_OFF_BITS (64 - 44)
 typedef uint64_t VHDXBatEntry;
 
 /*  METADATA REGION STRUCTURES  */
@@ -252,8 +257,8 @@ typedef struct QEMU_PACKED VHDXMetadataTableEntry {
metadata region */
 /* note: if length = 0, so is offset */
 uint32_tlength; /* length of metadata. <= 1MB. */
-uint32_tdata_bits;  /* least-significant 3 bits are flags, the
-   rest are reserved (see above) */
+uint32_tdata_bits;  /* least-significant 3 bits are flags,
+   the rest are reserved (see above)

[Qemu-devel] [PATCH v5 01/16] block: vhdx - minor comments and typo correction.

2013-09-03 Thread Jeff Cody

Just a couple of minor comments to help note where allocated
buffers are freed, and a typo fix.

Signed-off-by: Jeff Cody 
---
 block/vhdx.c | 6 --
 block/vhdx.h | 6 +++---
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/block/vhdx.c b/block/vhdx.c
index e9704b1..56bc88e 100644
--- a/block/vhdx.c
+++ b/block/vhdx.c
@@ -6,9 +6,9 @@
  * Authors:
  *  Jeff Cody 
  *
- *  This is based on the "VHDX Format Specification v0.95", published 4/12/2012
+ *  This is based on the "VHDX Format Specification v1.00", published 8/25/2012
  *  by Microsoft:
- *  https://www.microsoft.com/en-us/download/details.aspx?id=29681
+ *  https://www.microsoft.com/en-us/download/details.aspx?id=34750
  *
  * This work is licensed under the terms of the GNU LGPL, version 2 or later.
  * See the COPYING.LIB file in the top-level directory.
@@ -262,6 +262,7 @@ static int vhdx_parse_header(BlockDriverState *bs, BDRVVHDXState *s)
 uint64_t h2_seq = 0;
 uint8_t *buffer;
 
+/* header1 & header2 are freed in vhdx_close() */
 header1 = qemu_blockalign(bs, sizeof(VHDXHeader));
 header2 = qemu_blockalign(bs, sizeof(VHDXHeader));
 
@@ -787,6 +788,7 @@ static int vhdx_open(BlockDriverState *bs, QDict *options, int flags)
 goto fail;
 }
 
+/* s->bat is freed in vhdx_close() */
 s->bat = qemu_blockalign(bs, s->bat_rt.length);
 
 ret = bdrv_pread(bs->file, s->bat_offset, s->bat, s->bat_rt.length);
diff --git a/block/vhdx.h b/block/vhdx.h
index fb687ed..9eb6b97 100644
--- a/block/vhdx.h
+++ b/block/vhdx.h
@@ -6,9 +6,9 @@
  * Authors:
  *  Jeff Cody 
  *
- *  This is based on the "VHDX Format Specification v0.95", published 4/12/2012
+ *  This is based on the "VHDX Format Specification v1.00", published 8/25/2012
  *  by Microsoft:
- *  https://www.microsoft.com/en-us/download/details.aspx?id=29681
+ *  https://www.microsoft.com/en-us/download/details.aspx?id=34750
  *
  * This work is licensed under the terms of the GNU LGPL, version 2 or later.
  * See the COPYING.LIB file in the top-level directory.
@@ -116,7 +116,7 @@ typedef struct QEMU_PACKED VHDXHeader {
valid. */
 uint16_tlog_version;/* version of the log format. Mustn't be
zero, unless log_guid is also zero */
-uint16_tversion;/* version of th evhdx file.  Currently,
+uint16_tversion;/* version of the vhdx file.  Currently,
only supported version is "1" */
 uint32_tlog_length; /* length of the log.  Must be multiple
of 1MB */
-- 
1.8.1.4

[Qemu-devel] [PATCH v5 07/16] block: vhdx code movement - move vhdx_close() above vhdx_open()

2013-09-03 Thread Jeff Cody


Signed-off-by: Jeff Cody 
---
 block/vhdx.c | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/block/vhdx.c b/block/vhdx.c
index 4dc056b..9db6531 100644
--- a/block/vhdx.c
+++ b/block/vhdx.c
@@ -777,6 +777,15 @@ exit:
 }
 
 
+static void vhdx_close(BlockDriverState *bs)
+{
+BDRVVHDXState *s = bs->opaque;
+qemu_vfree(s->headers[0]);
+qemu_vfree(s->headers[1]);
+qemu_vfree(s->bat);
+qemu_vfree(s->parent_entries);
+}
+
 static int vhdx_open(BlockDriverState *bs, QDict *options, int flags)
 {
 BDRVVHDXState *s = bs->opaque;
@@ -1027,15 +1036,6 @@ static coroutine_fn int vhdx_co_writev(BlockDriverState *bs, int64_t sector_num,
 }
 
 
-static void vhdx_close(BlockDriverState *bs)
-{
-BDRVVHDXState *s = bs->opaque;
-qemu_vfree(s->headers[0]);
-qemu_vfree(s->headers[1]);
-qemu_vfree(s->bat);
-qemu_vfree(s->parent_entries);
-}
-
 static BlockDriver bdrv_vhdx = {
 .format_name= "vhdx",
 .instance_size  = sizeof(BDRVVHDXState),
-- 
1.8.1.4

Re: [Qemu-devel] [PATCH] seccomp: adding a second whitelist

2013-09-03 Thread Paul Moore

On Tuesday, September 03, 2013 05:07:53 PM Eduardo Otubo wrote:
> On 09/03/2013 03:21 PM, Paul Moore wrote:
> > On Tuesday, September 03, 2013 02:08:28 PM Corey Bryant wrote:
> >> On 09/03/2013 02:02 PM, Corey Bryant wrote:
> >>> On 08/30/2013 10:21 AM, Eduardo Otubo wrote:
>  On 08/29/2013 05:34 AM, Stefan Hajnoczi wrote:
> > On Wed, Aug 28, 2013 at 10:04:32PM -0300, Eduardo Otubo wrote:
> >> Now there's a second whitelist, right before the vcpu starts. The
> >> second
> >> whitelist is the same as the first one, except for exec() and
> >> select().
> > 
> > -netdev tap,downscript=/path/to/script requires exec() in the QEMU
> > shutdown code path.  Will this work with seccomp?
>  
>  I actually don't know, but I'll test that as well. Can you run a test
>  with this patch and -netdev? I mean, if you're pointing that out you
>  might have a scenario already setup, right?
>  
>  Thanks!
> >>> 
> >>> This uses exec() in net/tap.c.
> >>> 
> >>> I think if we're going to introduce a sandbox environment that restricts
> >>> existing QEMU behavior, then we have to introduce a new argument to the
> >>> -sandbox option.  So for example, "-sandbox on" would continue to use
> >>> the whitelist that allows everything in QEMU to work (or at least it
> >>> should :).  And something like "-sandbox on,strict=on" would use the
> >>> whitelist + blacklist.
> >>> 
> >>> If this is acceptable though, then I wonder how we could go about adding
> >>> new syscalls to the blacklist in future QEMU releases without regressing
> >>> "-sandbox on,strict=on".
> >> 
> >> Maybe a better approach is to provide support that allows libvirt to
> >> define the blacklist and pass it to QEMU?
> > 
> > FYI: I didn't want to mention this until I had some patches ready to post,
> > but I'm currently working on adding syscall filtering, via libseccomp, to
> > libvirt. I hope to get an initial RFC-quality patch out "soon".
> 
> Paul, if you need any help with Qemu and/or testing, please let me know.
> I would be glad to help :) When you post your RFC to libvirt mailing
> list please add me as CC.

Of course, I appreciate all the help I can get.  We can chat a bit more once 
the patches are posted.

-- 
paul moore
security and virtualization @ redhat

[Qemu-devel] [PATCH v5 02/16] block: vhdx - add header update capability.

2013-09-03 Thread Jeff Cody

This adds the ability to update the headers in a VHDX image, including
generating a new MS-compatible GUID.

As VHDX depends on uuid.h, VHDX is now a configurable build option.  If
VHDX support is enabled, that will also enable uuid as well.  The
default is to have VHDX enabled.

To enable/disable VHDX:  --enable-vhdx, --disable-vhdx

Signed-off-by: Jeff Cody 
---
 block/Makefile.objs |   2 +-
 block/vhdx.c| 161 +++-
 block/vhdx.h|  14 -
 configure   |  24 
 4 files changed, 196 insertions(+), 5 deletions(-)

diff --git a/block/Makefile.objs b/block/Makefile.objs
index 4cf9aa4..e5e54e6 100644
--- a/block/Makefile.objs
+++ b/block/Makefile.objs
@@ -2,7 +2,7 @@ block-obj-y += raw.o cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o vpc.o vvfat
 block-obj-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o qcow2-cache.o
 block-obj-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o qed-cluster.o
 block-obj-y += qed-check.o
-block-obj-y += vhdx.o
+block-obj-$(CONFIG_VHDX) += vhdx.o
 block-obj-y += parallels.o blkdebug.o blkverify.o
 block-obj-y += snapshot.o qapi.o
 block-obj-$(CONFIG_WIN32) += raw-win32.o win32-aio.o
diff --git a/block/vhdx.c b/block/vhdx.c
index 56bc88e..7bd7c12 100644
--- a/block/vhdx.c
+++ b/block/vhdx.c
@@ -21,6 +21,7 @@
 #include "qemu/crc32c.h"
 #include "block/vhdx.h"
 
+#include <uuid/uuid.h>
 
 /* Several metadata and region table data entries are identified by
  * guids in  a MS-specific GUID format. */
@@ -156,11 +157,40 @@ typedef struct BDRVVHDXState {
 VHDXBatEntry *bat;
 uint64_t bat_offset;
 
+MSGUID session_guid;
+
+
 VHDXParentLocatorHeader parent_header;
 VHDXParentLocatorEntry *parent_entries;
 
 } BDRVVHDXState;
 
+/* Calculates new checksum.
+ *
+ * Zero is substituted during crc calculation for the original crc field
+ * crc_offset: byte offset in buf of the buffer crc
+ * buf: buffer pointer
+ * size: size of buffer (must be > crc_offset+4)
+ *
+ * Note: The resulting checksum is in the CPU endianness, not necessarily
+ *   in the file format endianness (LE).  Any header export to disk should
+ *   make sure that vhdx_header_le_export() is used to convert to the
+ *   correct endianness
+ */
+uint32_t vhdx_update_checksum(uint8_t *buf, size_t size, int crc_offset)
+{
+uint32_t crc;
+
+assert(buf != NULL);
+assert(size > (crc_offset + sizeof(crc)));
+
+memset(buf + crc_offset, 0, sizeof(crc));
+crc =  crc32c(0xffffffff, buf, size);
+memcpy(buf + crc_offset, &crc, sizeof(crc));
+
+return crc;
+}
+
 uint32_t vhdx_checksum_calc(uint32_t crc, uint8_t *buf, size_t size,
 int crc_offset)
 {
@@ -212,6 +242,19 @@ bool vhdx_checksum_is_valid(uint8_t *buf, size_t size, int crc_offset)
 
 
 /*
+ * This generates a UUID that is compliant with the MS GUIDs used
+ * in the VHDX spec (and elsewhere).
+ */
+void vhdx_guid_generate(MSGUID *guid)
+{
+uuid_t uuid;
+assert(guid != NULL);
+
+uuid_generate(uuid);
+memcpy(guid, uuid, sizeof(MSGUID));
+}
+
+/*
  * Per the MS VHDX Specification, for every VHDX file:
  *  - The header section is fixed size - 1 MB
  *  - The header section is always the first "object"
@@ -249,6 +292,113 @@ static void vhdx_header_le_import(VHDXHeader *h)
 le64_to_cpus(&h->log_offset);
 }
 
+/* All VHDX structures on disk are little endian */
+static void vhdx_header_le_export(VHDXHeader *orig_h, VHDXHeader *new_h)
+{
+assert(orig_h != NULL);
+assert(new_h != NULL);
+
+new_h->signature   = cpu_to_le32(orig_h->signature);
+new_h->checksum= cpu_to_le32(orig_h->checksum);
+new_h->sequence_number = cpu_to_le64(orig_h->sequence_number);
+
+new_h->file_write_guid = orig_h->file_write_guid;
+new_h->data_write_guid = orig_h->data_write_guid;
+new_h->log_guid= orig_h->log_guid;
+
+cpu_to_leguids(&new_h->file_write_guid);
+cpu_to_leguids(&new_h->data_write_guid);
+cpu_to_leguids(&new_h->log_guid);
+
+new_h->log_version = cpu_to_le16(orig_h->log_version);
+new_h->version = cpu_to_le16(orig_h->version);
+new_h->log_length  = cpu_to_le32(orig_h->log_length);
+new_h->log_offset  = cpu_to_le64(orig_h->log_offset);
+}
+
+/* Update the VHDX headers
+ *
+ * This follows the VHDX spec procedures for header updates.
+ *
+ *  - non-current header is updated with largest sequence number
+ */
+static int vhdx_update_header(BlockDriverState *bs, BDRVVHDXState *s,
+  bool generate_data_write_guid)
+{
+int ret = 0;
+int hdr_idx = 0;
+uint64_t header_offset = VHDX_HEADER1_OFFSET;
+
+VHDXHeader *active_header;
+VHDXHeader *inactive_header;
+VHDXHeader header_le;
+uint8_t *buffer;
+
+/* operate on the non-current header */
+if (s->curr_header == 0) {
+hdr_idx = 1;
+header_offset = VHDX_HEADER2_OFFSET;
+}
+
+active_h

[Qemu-devel] [PATCH v5 15/16] block: vhdx - fix comment typos in header, fix incorrect struct fields

2013-09-03 Thread Jeff Cody

VHDXPage83Data and VHDXParentLocatorHeader both incorrectly had their
MSGUID fields set as arrays of 16.  This is incorrect (it stems from
an early version where those fields were uint8_t arrays).  Those fields
were, up to this patch, unused.
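
To make the size difference concrete, here is a minimal sketch.  Guid is a
hypothetical stand-in assumed to mirror the usual 4+2+2+8 byte GUID layout,
and the two wrapper structs exist only for this comparison; none of these
names come from the patch.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 16-byte GUID (4 + 2 + 2 + 8 bytes). */
typedef struct Guid {
    uint32_t data1;
    uint16_t data2;
    uint16_t data3;
    uint8_t  data4[8];
} Guid;

/* Before: an array of 16 GUIDs, left over from a byte-array declaration. */
typedef struct Page83Wrong {
    Guid page_83_data[16];
} Page83Wrong;

/* After: a single GUID, which is what the field is meant to hold. */
typedef struct Page83Fixed {
    Guid page_83_data;
} Page83Fixed;
```

On common ABIs the first wrapper occupies 256 bytes and the second 16, so
the array form would not line up with a 16-byte on-disk field.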

Also, there were a couple of typos and incorrect wording in comments,
and those have been fixed up as well.

Signed-off-by: Jeff Cody 
---
 block/vhdx.h | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/block/vhdx.h b/block/vhdx.h
index d35345d..c2ba697 100644
--- a/block/vhdx.h
+++ b/block/vhdx.h
@@ -58,7 +58,7 @@
 typedef struct VHDXFileIdentifier {
 uint64_tsignature;  /* "vhdxfile" in ASCII */
 uint16_tcreator[256];   /* optional; utf-16 string to identify
-   the vhdx file creator.  Diagnotistic
+   the vhdx file creator.  Diagnostic
only */
 } VHDXFileIdentifier;
 
@@ -114,8 +114,8 @@ typedef struct QEMU_PACKED VHDXHeader {
there is no valid log. If non-zero,
log entries with this guid are
valid. */
-uint16_tlog_version;/* version of the log format. Mustn't be
-   zero, unless log_guid is also zero */
+uint16_tlog_version;/* version of the log format. Must be
+   set to zero */
 uint16_tversion;/* version of the vhdx file.  Currently,
only supported version is "1" */
 uint32_tlog_length; /* length of the log.  Must be multiple
@@ -281,7 +281,7 @@ typedef struct QEMU_PACKED VHDXVirtualDiskSize {
 } VHDXVirtualDiskSize;
 
 typedef struct QEMU_PACKED VHDXPage83Data {
-MSGUID  page_83_data[16];   /* unique id for scsi devices that
+MSGUID  page_83_data;   /* unique id for scsi devices that
support page 0x83 */
 } VHDXPage83Data;
 
@@ -296,7 +296,7 @@ typedef struct QEMU_PACKED VHDXVirtualDiskPhysicalSectorSize {
 } VHDXVirtualDiskPhysicalSectorSize;
 
 typedef struct QEMU_PACKED VHDXParentLocatorHeader {
-MSGUID  locator_type[16];   /* type of the parent virtual disk. */
+MSGUID  locator_type;   /* type of the parent virtual disk. */
 uint16_treserved;
 uint16_tkey_value_count;/* number of key/value pairs for this
locator */
-- 
1.8.1.4

[Qemu-devel] [PATCH v6 00/24] AArch64 preparation patchset

2013-09-03 Thread Peter Maydell

This patchset is v6 of the "preparation patchset" that started
off with Alex, was passed to John Rigby and now to me.

*** I plan to commit these patches (except the configs
*** patches which actually enable the aarch64 targets)
*** to target-arm.next unless there are issues raised in
*** review of this series.

We've got 1.6 out of the door now, and I expect the
aarch64-linux-user code to arrive before 1.7, so I think
the time is now right to get these preparatory patches
into master. If there's anything you want to see addressed
before then, please mention it. (My apologies if I've failed
to notice any review comments on earlier versions of the
series; if so, please flag that up.)

With these patches:
 * new target aarch64-linux-user, which will run but
   SIGILL on all instructions
 * new target aarch64-softmmu, which will run all the 32 bit
   CPUs and board models; however there is no 64 bit CPU
   defined so it's a bit pointless except as a demonstration
   that we haven't broken the 32 bit code.

Available in git at:
  git://git.linaro.org/people/pmaydell/qemu-arm.git aarch64

Changes since v5:
 * created a QOM type AArch64CPU -- all 64 bit capable CPUs
   will be subtypes of this. This means we can handle things
   like gdb set/get functions and the cpu state dump function
   by just setting the class function pointers here, which
   is a nice little cleanup.
 * fixed bogus type for sigaltstack ss_flags field
 * added endianness handling for vfp regs in signal struct
 * dropped the aarch64-fpu.xml (it was unused). I have a
   prototype patch which adds it back and actually registers
   it together with some fp reg load/save functions, but I
   don't currently have any way of testing it so it's not
   in this patchset.
 * generalised the workaround for the guest glibc barfing
   if the kernel version reports as <3.8.0 and integrated it
   better with the existing "lie to guest about version" code
 * aarch64-linux-user only provides 64 bit capable CPUs

Changes v4 to v5:
 * various bits of cleanup for style and other minor things
 * a little shuffling and splitting of patches
 * made the 32 bit CPUs work in aarch64-softmmu
 * given aarch64 its own cpu_loop() in linux-user/main.c 
 * made sure NWFPE doesn't sneak into aarch64-linux-user
 * let aarch64 have a nearly clean slate for tb_flags

Note that in general the 'signed-off-by:' lines from people
other than me should be taken to indicate credit/authorship
rather than "I'm happy with this patchset", given that I've
changed the patches as they passed through my hands.

Individual patches have a summary of my changes in the commit.

Alexander Graf (13):
  target-arm: Extract the disas struct to a header file
  target-arm: Export cpu_env
  target-arm: Fix target_ulong/uint32_t confusions
  target-arm: Prepare translation for AArch64 code
  target-arm: Add AArch64 translation stub
  target-arm: Add AArch64 gdbstub support
  linux-user: Don't treat AArch64 cpu names specially
  linux-user: Add syscall number definitions for AArch64
  linux-user: Fix up AArch64 syscall handlers
  linux-user: Implement cpu_set_tls() and cpu_clone_regs() for AArch64
  linux-user: Add AArch64 termbits.h definitions
  linux-user: Add AArch64 support
  configure: Add handling code for AArch64 targets

Andreas Schwab (1):
  linux-user: Add signal handling for AArch64

Peter Maydell (10):
  target-arm: Make '-cpu any' available in linux-user mode only
  target-arm: Abstract out load/store from a vaddr in AArch32
  target-arm: Pass DisasContext* to gen_set_pc_im()
  target-arm: Add new AArch64CPUInfo base class and subclasses
  target-arm: Disable 32 bit CPUs in 64 bit linux-user builds
  linux-user: Add cpu loop for AArch64
  linux-user: Make sure NWFPE code is 32 bit ARM only
  linux-user: Allow targets to specify a minimum uname release
  default-configs: Add config for aarch64-linux-user
  default-configs: Add config for aarch64-softmmu

 configure  |7 +-
 default-configs/aarch64-linux-user.mak |3 +
 default-configs/aarch64-softmmu.mak|   82 ++
 gdb-xml/aarch64-core.xml   |   46 
 linux-user/aarch64/syscall.h   |9 +
 linux-user/aarch64/syscall_nr.h|  323 +++
 linux-user/aarch64/target_cpu.h|   35 +++
 linux-user/aarch64/target_signal.h |   29 +++
 linux-user/aarch64/termbits.h  |  220 
 linux-user/cpu-uname.c |3 +-
 linux-user/elfload.c   |   15 +-
 linux-user/main.c  |  100 +++
 linux-user/qemu.h  |5 +-
 linux-user/signal.c|  260 +++
 linux-user/syscall.c   |   67 +++--
 linux-user/syscall_defs.h  |   28 +-
 target-arm/Makefile.objs   |1 +
 target-arm/cpu-qom.h   |   19 ++
 target-arm/cpu.c   |   21 +-
 target-arm/cpu.h   |

[Qemu-devel] [PATCH v5 03/16] block: vhdx code movement - VHDXMetadataEntries and BDRVVHDXState to header.

2013-09-03 Thread Jeff Cody

In preparation for VHDX log support, move these structures to the
header.

Signed-off-by: Jeff Cody 
---
 block/vhdx.c | 51 ---
 block/vhdx.h | 47 +++
 2 files changed, 47 insertions(+), 51 deletions(-)

diff --git a/block/vhdx.c b/block/vhdx.c
index 7bd7c12..68648e1 100644
--- a/block/vhdx.c
+++ b/block/vhdx.c
@@ -104,16 +104,6 @@ static const MSGUID parent_vhdx_guid = { .data1 = 0xb04aefb7,
  META_PAGE_83_PRESENT | META_LOGICAL_SECTOR_SIZE_PRESENT | \
  META_PHYS_SECTOR_SIZE_PRESENT)
 
-typedef struct VHDXMetadataEntries {
-VHDXMetadataTableEntry file_parameters_entry;
-VHDXMetadataTableEntry virtual_disk_size_entry;
-VHDXMetadataTableEntry page83_data_entry;
-VHDXMetadataTableEntry logical_sector_size_entry;
-VHDXMetadataTableEntry phys_sector_size_entry;
-VHDXMetadataTableEntry parent_locator_entry;
-uint16_t present;
-} VHDXMetadataEntries;
-
 
 typedef struct VHDXSectorInfo {
 uint32_t bat_idx;   /* BAT entry index */
@@ -124,47 +114,6 @@ typedef struct VHDXSectorInfo {
 uint64_t block_offset;  /* block offset, in bytes */
 } VHDXSectorInfo;
 
-
-
-typedef struct BDRVVHDXState {
-CoMutex lock;
-
-int curr_header;
-VHDXHeader *headers[2];
-
-VHDXRegionTableHeader rt;
-VHDXRegionTableEntry bat_rt; /* region table for the BAT */
-VHDXRegionTableEntry metadata_rt;/* region table for the metadata */
-
-VHDXMetadataTableHeader metadata_hdr;
-VHDXMetadataEntries metadata_entries;
-
-VHDXFileParameters params;
-uint32_t block_size;
-uint32_t block_size_bits;
-uint32_t sectors_per_block;
-uint32_t sectors_per_block_bits;
-
-uint64_t virtual_disk_size;
-uint32_t logical_sector_size;
-uint32_t physical_sector_size;
-
-uint64_t chunk_ratio;
-uint32_t chunk_ratio_bits;
-uint32_t logical_sector_size_bits;
-
-uint32_t bat_entries;
-VHDXBatEntry *bat;
-uint64_t bat_offset;
-
-MSGUID session_guid;
-
-
-VHDXParentLocatorHeader parent_header;
-VHDXParentLocatorEntry *parent_entries;
-
-} BDRVVHDXState;
-
 /* Calculates new checksum.
  *
  * Zero is substituted during crc calculation for the original crc field
diff --git a/block/vhdx.h b/block/vhdx.h
index 403f766..74b2d5d 100644
--- a/block/vhdx.h
+++ b/block/vhdx.h
@@ -308,6 +308,53 @@ typedef struct QEMU_PACKED VHDXParentLocatorEntry {
 
 /* - END VHDX SPECIFICATION STRUCTURES  */
 
+typedef struct VHDXMetadataEntries {
+VHDXMetadataTableEntry file_parameters_entry;
+VHDXMetadataTableEntry virtual_disk_size_entry;
+VHDXMetadataTableEntry page83_data_entry;
+VHDXMetadataTableEntry logical_sector_size_entry;
+VHDXMetadataTableEntry phys_sector_size_entry;
+VHDXMetadataTableEntry parent_locator_entry;
+uint16_t present;
+} VHDXMetadataEntries;
+
+typedef struct BDRVVHDXState {
+CoMutex lock;
+
+int curr_header;
+VHDXHeader *headers[2];
+
+VHDXRegionTableHeader rt;
+VHDXRegionTableEntry bat_rt; /* region table for the BAT */
+VHDXRegionTableEntry metadata_rt;/* region table for the metadata */
+
+VHDXMetadataTableHeader metadata_hdr;
+VHDXMetadataEntries metadata_entries;
+
+VHDXFileParameters params;
+uint32_t block_size;
+uint32_t block_size_bits;
+uint32_t sectors_per_block;
+uint32_t sectors_per_block_bits;
+
+uint64_t virtual_disk_size;
+uint32_t logical_sector_size;
+uint32_t physical_sector_size;
+
+uint64_t chunk_ratio;
+uint32_t chunk_ratio_bits;
+uint32_t logical_sector_size_bits;
+
+uint32_t bat_entries;
+VHDXBatEntry *bat;
+uint64_t bat_offset;
+
+MSGUID session_guid;
+
+VHDXParentLocatorHeader parent_header;
+VHDXParentLocatorEntry *parent_entries;
+
+} BDRVVHDXState;
 
 void vhdx_guid_generate(MSGUID *guid);
 
-- 
1.8.1.4

[Qemu-devel] [PATCH v5 10/16] block: vhdx - add log write support

2013-09-03 Thread Jeff Cody

This adds support for writing to the VHDX log.

For spec details, see VHDX Specification Format v1.00:
https://www.microsoft.com/en-us/download/details.aspx?id=34750

There are a few limitations to this log support:
1.) There is no caching yet
2.) The log is flushed after each entry

The primary write interface, vhdx_log_write_and_flush(), performs a log
write followed by an immediate flush of the log.

As each log entry sector is a minimum of 4KB, partial sector writes are
filled in with data from the disk write destination.

If the current file log GUID is 0, a new GUID is generated and updated
in the header.
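
The padding of partial sector writes described above can be sketched as follows. This is only an illustration under stated assumptions, not the code from this patch: the names AlignedSpan, align_write, head_pad and tail_pad are hypothetical, and the only fact taken from the patch is the 4096-byte log sector size.

```c
#include <assert.h>
#include <stdint.h>

#define LOG_SECTOR_SIZE 4096u   /* log sector size stated in the patch */

/* Hypothetical helper type for illustration; not part of the patch. */
typedef struct AlignedSpan {
    uint64_t file_offset;   /* start of the first sector touched */
    uint32_t head_pad;      /* bytes to fill in before the payload */
    uint32_t tail_pad;      /* bytes to fill in after the payload */
    uint32_t sectors;       /* whole sectors covering the write */
} AlignedSpan;

/* Expand an arbitrary (offset, length) write to whole 4096-byte sectors.
 * The pad bytes are what would have to be read back from the disk write
 * destination so that only complete sectors are logged. */
static AlignedSpan align_write(uint64_t offset, uint32_t length)
{
    AlignedSpan a;
    uint64_t end;

    assert(length > 0);

    end = offset + length;
    a.head_pad = (uint32_t)(offset % LOG_SECTOR_SIZE);
    a.file_offset = offset - a.head_pad;
    a.tail_pad = (uint32_t)((LOG_SECTOR_SIZE - (end % LOG_SECTOR_SIZE))
                            % LOG_SECTOR_SIZE);
    a.sectors = (uint32_t)((end + a.tail_pad - a.file_offset)
                           / LOG_SECTOR_SIZE);
    return a;
}
```

For example, a 200-byte write at offset 4000 straddles a sector boundary, so under these assumptions it expands to two full sectors with 4000 bytes of head padding and 3992 bytes of tail padding.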

Signed-off-by: Jeff Cody 
---
 block/vhdx-log.c | 276 +++
 block/vhdx.h |   3 +
 2 files changed, 279 insertions(+)

diff --git a/block/vhdx-log.c b/block/vhdx-log.c
index 10a87cc..09cb80b 100644
--- a/block/vhdx-log.c
+++ b/block/vhdx-log.c
@@ -156,6 +156,55 @@ exit:
 return ret;
 }
 
+/* Writes num_sectors to the log (all log sectors are 4096 bytes),
+ * from buffer 'buffer'.  Upon return, *sectors_written will contain
+ * the number of sectors successfully written.
+ *
+ * It is assumed that 'buffer' is at least 4096*num_sectors large.
+ *
+ * 0 is returned on success, -errno otherwise */
+static int vhdx_log_write_sectors(BlockDriverState *bs, VHDXLogEntries *log,
+  uint32_t *sectors_written, void *buffer,
+  uint32_t num_sectors)
+{
+int ret = 0;
+uint64_t offset;
+uint32_t write;
+void *buffer_tmp;
+BDRVVHDXState *s = bs->opaque;
+
+ret = vhdx_user_visible_write(bs, s);
+if (ret < 0) {
+goto exit;
+}
+
+write = log->write;
+
+buffer_tmp = buffer;
+while (num_sectors) {
+
+offset = log->offset + write;
+write = vhdx_log_inc_idx(write, log->length);
+if (write == log->read) {
+/* full */
+break;
+}
+ret = bdrv_pwrite(bs->file, offset, buffer_tmp, VHDX_LOG_SECTOR_SIZE);
+if (ret < 0) {
+goto exit;
+}
+buffer_tmp += VHDX_LOG_SECTOR_SIZE;
+
+log->write = write;
+*sectors_written = *sectors_written + 1;
+num_sectors--;
+}
+
+exit:
+return ret;
+}
+
+
 /* Validates a log entry header */
 static bool vhdx_log_hdr_is_valid(VHDXLogEntries *log, VHDXLogEntryHeader *hdr,
   BDRVVHDXState *s)
@@ -721,3 +770,230 @@ exit:
 }
 
 
+
+static void vhdx_log_raw_to_le_sector(VHDXLogDescriptor *desc,
+  VHDXLogDataSector *sector, void *data,
+  uint64_t seq)
+{
+/* 8 + 4084 + 4 = 4096, 1 log sector */
+memcpy(&desc->leading_bytes, data, 8);
+data += 8;
+cpu_to_le64s(&desc->leading_bytes);
+memcpy(sector->data, data, 4084);
+data += 4084;
+memcpy(&desc->trailing_bytes, data, 4);
+cpu_to_le32s(&desc->trailing_bytes);
+data += 4;
+
+sector->sequence_high  = (uint32_t) (seq >> 32);
+sector->sequence_low   = (uint32_t) (seq & 0xffffffff);
+sector->data_signature = VHDX_LOG_DATA_SIGNATURE;
+
+vhdx_log_desc_le_export(desc);
+vhdx_log_data_le_export(sector);
+}
+
+
+static int vhdx_log_write(BlockDriverState *bs, BDRVVHDXState *s,
+  void *data, uint32_t length, uint64_t offset)
+{
+int ret = 0;
+void *buffer = NULL;
+void *merged_sector = NULL;
+void *data_tmp, *sector_write;
+unsigned int i;
+int sector_offset;
+uint32_t desc_sectors, sectors, total_length;
+uint32_t sectors_written = 0;
+uint32_t aligned_length;
+uint32_t leading_length = 0;
+uint32_t trailing_length = 0;
+uint32_t partial_sectors = 0;
+uint32_t bytes_written = 0;
+uint64_t file_offset;
+VHDXHeader *header;
+VHDXLogEntryHeader new_hdr;
+VHDXLogDescriptor *new_desc = NULL;
+VHDXLogDataSector *data_sector = NULL;
+MSGUID new_guid = { 0 };
+
+header = s->headers[s->curr_header];
+
+/* need to have offset read data, and be on 4096 byte boundary */
+
+if (length > header->log_length) {
+/* no log present.  we could create a log here instead of failing */
+ret = -EINVAL;
+goto exit;
+}
+
+if (guid_eq(header->log_guid, zero_guid)) {
+vhdx_guid_generate(&new_guid);
+vhdx_update_headers(bs, s, false, &new_guid);
+} else {
+/* currently, we require that the log be flushed after
+ * every write. */
+ret = -ENOTSUP;
+}
+
+/* 0 is an invalid sequence number, but may also represent the first
+ * log write (or a wrapped seq) */
+if (s->log.sequence == 0) {
+s->log.sequence = 1;
+}
+
+sector_offset = offset % VHDX_LOG_SECTOR_SIZE;
+file_offset = (offset / VHDX_LOG_SECTOR_SIZE) * VHDX_LOG_SECTOR_SIZE;
+
+aligned_length = length;
+
+/* add in the unaligned head and tail bytes */
+

[Qemu-devel] [PATCH v5 08/16] block: vhdx - log parsing, replay, and flush support

2013-09-03 Thread Jeff Cody

This adds support for VHDX v0 logs, as specified in Microsoft's
VHDX Specification Format v1.00:
https://www.microsoft.com/en-us/download/details.aspx?id=34750

The following support is added:

* Log parsing, and validation - validate that an existing log
  is correct.

* Log search - search through an existing log, to find any valid
  sequence of entries.

* Log replay and flush - replay an existing log, and flush/clear
  the log when complete.

The VHDX log is a circular buffer, with elements (sectors) of 4KB.

A log entry is a variable-length run of sectors, comprising a header
and 'descriptors' that describe each sector.

A log may contain multiple entries, known as a log sequence.  In a log
sequence, each log entry immediately follows the previous entry, with an
incrementing sequence number.  There can only ever be one active and
valid sequence in the log.

Each log entry must match the file log GUID in order to be valid (along
with other criteria).  Once we have flushed all valid log entries, we
mark the file log GUID as zero, which indicates a buffer with no
valid entries.
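
The circular-buffer indexing described above can be sketched as below. This is a simplified model, not the patch's code: LogRing and the ring_* functions are hypothetical names, and the wrap-to-zero rule and the 'one sector open' full test are assumptions based on the description in this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define LOG_SECTOR_SIZE 4096u

/* Hypothetical stand-in for the log state: byte offsets into the ring. */
typedef struct LogRing {
    uint32_t read;     /* offset of the oldest unread sector */
    uint32_t write;    /* offset of the next sector to write */
    uint32_t length;   /* ring size in bytes, a multiple of the sector size */
} LogRing;

/* Advance an index by one sector, wrapping to 0 at the end of the ring. */
static uint32_t ring_inc(uint32_t idx, uint32_t length)
{
    assert(length % LOG_SECTOR_SIZE == 0);
    assert(idx % LOG_SECTOR_SIZE == 0);

    idx += LOG_SECTOR_SIZE;
    return idx >= length ? 0 : idx;
}

/* Empty: nothing between read and write. */
static bool ring_empty(const LogRing *r)
{
    return r->read == r->write;
}

/* Full: one sector is always left open, so the ring is full when the
 * next write position would land on the read position. */
static bool ring_full(const LogRing *r)
{
    return ring_inc(r->write, r->length) == r->read;
}

/* Sectors that can still be written before the ring reports full. */
static uint32_t ring_free_sectors(const LogRing *r)
{
    uint64_t used = ((uint64_t)r->write + r->length - r->read) % r->length;
    return r->length / LOG_SECTOR_SIZE - 1 - (uint32_t)(used / LOG_SECTOR_SIZE);
}
```

Leaving one sector open costs 4KB of capacity but lets read == write mean "empty" unambiguously, without a separate element counter.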

Signed-off-by: Jeff Cody 
---
 block/Makefile.objs |   2 +-
 block/vhdx-log.c| 723 
 block/vhdx.c|  69 ++---
 block/vhdx.h|   7 +-
 4 files changed, 750 insertions(+), 51 deletions(-)
 create mode 100644 block/vhdx-log.c

diff --git a/block/Makefile.objs b/block/Makefile.objs
index e6f5d33..2fbd79a 100644
--- a/block/Makefile.objs
+++ b/block/Makefile.objs
@@ -2,7 +2,7 @@ block-obj-y += raw.o cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o vpc.o vvfat
 block-obj-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o qcow2-cache.o
 block-obj-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o qed-cluster.o
 block-obj-y += qed-check.o
-block-obj-$(CONFIG_VHDX) += vhdx.o vhdx-endian.o
+block-obj-$(CONFIG_VHDX) += vhdx.o vhdx-endian.o vhdx-log.o
 block-obj-y += parallels.o blkdebug.o blkverify.o
 block-obj-y += snapshot.o qapi.o
 block-obj-$(CONFIG_WIN32) += raw-win32.o win32-aio.o
diff --git a/block/vhdx-log.c b/block/vhdx-log.c
new file mode 100644
index 000..10a87cc
--- /dev/null
+++ b/block/vhdx-log.c
@@ -0,0 +1,723 @@
+/*
+ * Block driver for Hyper-V VHDX Images
+ *
+ * Copyright (c) 2013 Red Hat, Inc.,
+ *
+ * Authors:
+ *  Jeff Cody 
+ *
+ *  This is based on the "VHDX Format Specification v1.00", published 8/25/2012
+ *  by Microsoft:
+ *  https://www.microsoft.com/en-us/download/details.aspx?id=34750
+ *
+ * This file covers the functionality of the metadata log writing, parsing, and
+ * replay.
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2 or later.
+ * See the COPYING.LIB file in the top-level directory.
+ *
+ */
+#include "qemu-common.h"
+#include "block/block_int.h"
+#include "qemu/module.h"
+#include "block/vhdx.h"
+
+
+typedef struct VHDXLogSequence {
+bool valid;
+uint32_t count;
+VHDXLogEntries log;
+VHDXLogEntryHeader hdr;
+} VHDXLogSequence;
+
+typedef struct VHDXLogDescEntries {
+VHDXLogEntryHeader hdr;
+VHDXLogDescriptor desc[];
+} VHDXLogDescEntries;
+
+static const MSGUID zero_guid = { 0 };
+
+/* The log located on the disk is circular buffer containing
+ * sectors of 4096 bytes each.
+ *
+ * It is assumed for the read/write functions below that the
+ * circular buffer scheme uses a 'one sector open' to indicate
+ * the buffer is full.  Given the validation methods used for each
+ * sector, this method should be compatible with other methods that
+ * do not waste a sector.
+ */
+
+
+/* Allow peeking at the hdr entry at the beginning of the current
+ * read index, without advancing the read index */
+static int vhdx_log_peek_hdr(BlockDriverState *bs, VHDXLogEntries *log,
+ VHDXLogEntryHeader *hdr)
+{
+int ret = 0;
+uint64_t offset;
+uint32_t read;
+
+assert(hdr != NULL);
+
+/* peek is only supported on sector boundaries */
+if (log->read % VHDX_LOG_SECTOR_SIZE) {
+ret = -EFAULT;
+goto exit;
+}
+
+read = log->read;
+/* we are guaranteed that a) log sectors are 4096 bytes,
+ * and b) the log length is a multiple of 1MB. So, there
+ * is always a round number of sectors in the buffer */
+if ((read + sizeof(VHDXLogEntryHeader)) > log->length) {
+read = 0;
+}
+
+if (read == log->write) {
+ret = -EINVAL;
+goto exit;
+}
+
+offset = log->offset + read;
+
+ret = bdrv_pread(bs->file, offset, hdr, sizeof(VHDXLogEntryHeader));
+if (ret < 0) {
+goto exit;
+}
+
+exit:
+return ret;
+}
+
+/* Index increment for log, based on sector boundaries */
+static int vhdx_log_inc_idx(uint32_t idx, uint64_t length)
+{
+idx += VHDX_LOG_SECTOR_SIZE;
+/* we are guaranteed that a) log sectors are 4096 bytes,
+ * and b) the log length is a multiple of 1MB. So, there
+ * is always a round number of sectors in the buffer *

Re: [Qemu-devel] [PATCH] seccomp: adding a second whitelist

2013-09-03 Thread Eduardo Otubo




On 09/03/2013 03:21 PM, Paul Moore wrote:

On Tuesday, September 03, 2013 02:08:28 PM Corey Bryant wrote:

On 09/03/2013 02:02 PM, Corey Bryant wrote:

On 08/30/2013 10:21 AM, Eduardo Otubo wrote:

On 08/29/2013 05:34 AM, Stefan Hajnoczi wrote:

On Wed, Aug 28, 2013 at 10:04:32PM -0300, Eduardo Otubo wrote:

Now there's a second whitelist, right before the vcpu starts. The
second
whitelist is the same as the first one, except for exec() and select().


-netdev tap,downscript=/path/to/script requires exec() in the QEMU
shutdown code path.  Will this work with seccomp?


I actually don't know, but I'll test that as well. Can you run a test
with this patch and -netdev? I mean, if you're pointing that out you
might have a scenario already setup, right?

Thanks!


This uses exec() in net/tap.c.

I think if we're going to introduce a sandbox environment that restricts
existing QEMU behavior, then we have to introduce a new argument to the
-sandbox option.  So for example, "-sandbox on" would continue to use
the whitelist that allows everything in QEMU to work (or at least it
should :).  And something like "-sandbox on,strict=on" would use the
whitelist + blacklist.

If this is acceptable though, then I wonder how we could go about adding
new syscalls to the blacklist in future QEMU releases without regressing
"-sandbox on,strict=on".


Maybe a better approach is to provide support that allows libvirt to
define the blacklist and pass it to QEMU?


FYI: I didn't want to mention this until I had some patches ready to post, but
I'm currently working on adding syscall filtering, via libseccomp, to libvirt.
I hope to get an initial RFC-quality patch out "soon".



Paul, if you need any help with Qemu and/or testing, please let me know. 
I would be glad to help :) When you post your RFC to libvirt mailing 
list please add me as CC.


--
Eduardo Otubo
IBM Linux Technology Center

Re: [Qemu-devel] [PATCH] seccomp: adding a second whitelist

2013-09-03 Thread Eduardo Otubo




On 09/03/2013 03:02 PM, Corey Bryant wrote:



On 08/30/2013 10:21 AM, Eduardo Otubo wrote:



On 08/29/2013 05:34 AM, Stefan Hajnoczi wrote:

On Wed, Aug 28, 2013 at 10:04:32PM -0300, Eduardo Otubo wrote:

Now there's a second whitelist, right before the vcpu starts. The
second
whitelist is the same as the first one, except for exec() and select().


-netdev tap,downscript=/path/to/script requires exec() in the QEMU
shutdown code path.  Will this work with seccomp?


I actually don't know, but I'll test that as well. Can you run a test
with this patch and -netdev? I mean, if you're pointing that out you
might have a scenario already setup, right?

Thanks!



This uses exec() in net/tap.c.

I think if we're going to introduce a sandbox environment that restricts
existing QEMU behavior, then we have to introduce a new argument to the
-sandbox option.  So for example, "-sandbox on" would continue to use
the whitelist that allows everything in QEMU to work (or at least it
should :).  And something like "-sandbox on,strict=on" would use the
whitelist + blacklist.


I think this is very reasonable. I'll work on implementing this 
option for v2.




If this is acceptable though, then I wonder how we could go about adding
new syscalls to the blacklist in future QEMU releases without regressing
"-sandbox on,strict=on".

By the way, are any test buckets running regularly with -sandbox on?


I am running tests with virt-test. Need to come up with a script that 
checks for unused syscalls, though.


--
Eduardo Otubo
IBM Linux Technology Center

[Qemu-devel] [PATCH v5 05/16] block: vhdx - break endian translation functions out

2013-09-03 Thread Jeff Cody

This moves the endian translation functions out from the vhdx.c source,
into a separate source file. In addition to the previously defined
endian functions, new endian translation functions for log support are
added as well.
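
As a rough illustration of what such import/export helpers do, the sketch below converts a small record between native form and a little-endian byte layout. It is an assumption-laden example, not the patch's implementation: DemoHeader, its 12-byte layout, and the load_le/store_le helpers are hypothetical, and it uses byte shifts rather than QEMU's le32_to_cpus()/cpu_to_le32() family so that it stands alone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 32-bit value, whatever the host byte order. */
static uint32_t load_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Write a 32-bit value in little-endian byte order. */
static void store_le32(uint8_t *p, uint32_t v)
{
    for (size_t i = 0; i < 4; i++) {
        p[i] = (uint8_t)(v >> (8 * i));
    }
}

static uint64_t load_le64(const uint8_t *p)
{
    return (uint64_t)load_le32(p) | ((uint64_t)load_le32(p + 4) << 32);
}

static void store_le64(uint8_t *p, uint64_t v)
{
    store_le32(p, (uint32_t)v);
    store_le32(p + 4, (uint32_t)(v >> 32));
}

/* Hypothetical record: 4-byte signature followed by 8-byte sequence. */
typedef struct DemoHeader {
    uint32_t signature;
    uint64_t sequence_number;
} DemoHeader;

#define DEMO_HEADER_DISK_SIZE 12

/* Native struct -> little-endian disk bytes. */
static void demo_header_export(const DemoHeader *h, uint8_t *out)
{
    assert(h != NULL && out != NULL);
    store_le32(out, h->signature);
    store_le64(out + 4, h->sequence_number);
}

/* Little-endian disk bytes -> native struct. */
static void demo_header_import(DemoHeader *h, const uint8_t *in)
{
    assert(h != NULL && in != NULL);
    h->signature = load_le32(in);
    h->sequence_number = load_le64(in + 4);
}
```

The patch itself converts fields in place inside the packed structures; the byte-wise form here is just the easiest way to show the same round trip without relying on host endianness.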

Signed-off-by: Jeff Cody 
---
 block/Makefile.objs |   2 +-
 block/vhdx-endian.c | 141 
 block/vhdx.c|  43 
 block/vhdx.h|   8 +++
 4 files changed, 150 insertions(+), 44 deletions(-)
 create mode 100644 block/vhdx-endian.c

diff --git a/block/Makefile.objs b/block/Makefile.objs
index e5e54e6..e6f5d33 100644
--- a/block/Makefile.objs
+++ b/block/Makefile.objs
@@ -2,7 +2,7 @@ block-obj-y += raw.o cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o vpc.o vvfat
 block-obj-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o qcow2-cache.o
 block-obj-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o qed-cluster.o
 block-obj-y += qed-check.o
-block-obj-$(CONFIG_VHDX) += vhdx.o
+block-obj-$(CONFIG_VHDX) += vhdx.o vhdx-endian.o
 block-obj-y += parallels.o blkdebug.o blkverify.o
 block-obj-y += snapshot.o qapi.o
 block-obj-$(CONFIG_WIN32) += raw-win32.o win32-aio.o
diff --git a/block/vhdx-endian.c b/block/vhdx-endian.c
new file mode 100644
index 000..3e93e63
--- /dev/null
+++ b/block/vhdx-endian.c
@@ -0,0 +1,141 @@
+/*
+ * Block driver for Hyper-V VHDX Images
+ *
+ * Copyright (c) 2013 Red Hat, Inc.,
+ *
+ * Authors:
+ *  Jeff Cody 
+ *
+ *  This is based on the "VHDX Format Specification v1.00", published 8/25/2012
+ *  by Microsoft:
+ *  https://www.microsoft.com/en-us/download/details.aspx?id=34750
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2 or later.
+ * See the COPYING.LIB file in the top-level directory.
+ *
+ */
+
+#include "qemu-common.h"
+#include "block/block_int.h"
+#include "block/vhdx.h"
+
+#include 
+
+
+/*
+ * All the VHDX formats on disk are little endian - the following
+ * are helper import/export functions to correctly convert
+ * endianness from disk read to native cpu format, and back again.
+ */
+
+
+/* VHDX File Header */
+
+
+void vhdx_header_le_import(VHDXHeader *h)
+{
+assert(h != NULL);
+
+le32_to_cpus(&h->signature);
+le32_to_cpus(&h->checksum);
+le64_to_cpus(&h->sequence_number);
+
+leguid_to_cpus(&h->file_write_guid);
+leguid_to_cpus(&h->data_write_guid);
+leguid_to_cpus(&h->log_guid);
+
+le16_to_cpus(&h->log_version);
+le16_to_cpus(&h->version);
+le32_to_cpus(&h->log_length);
+le64_to_cpus(&h->log_offset);
+}
+
+void vhdx_header_le_export(VHDXHeader *orig_h, VHDXHeader *new_h)
+{
+assert(orig_h != NULL);
+assert(new_h != NULL);
+
+new_h->signature   = cpu_to_le32(orig_h->signature);
+new_h->checksum= cpu_to_le32(orig_h->checksum);
+new_h->sequence_number = cpu_to_le64(orig_h->sequence_number);
+
+new_h->file_write_guid = orig_h->file_write_guid;
+new_h->data_write_guid = orig_h->data_write_guid;
+new_h->log_guid= orig_h->log_guid;
+
+cpu_to_leguids(&new_h->file_write_guid);
+cpu_to_leguids(&new_h->data_write_guid);
+cpu_to_leguids(&new_h->log_guid);
+
+new_h->log_version = cpu_to_le16(orig_h->log_version);
+new_h->version = cpu_to_le16(orig_h->version);
+new_h->log_length  = cpu_to_le32(orig_h->log_length);
+new_h->log_offset  = cpu_to_le64(orig_h->log_offset);
+}
+
+
+/* VHDX Log Headers */
+
+
+void vhdx_log_desc_le_import(VHDXLogDescriptor *d)
+{
+assert(d != NULL);
+
+le32_to_cpus(&d->signature);
+le32_to_cpus(&d->trailing_bytes);
+le64_to_cpus(&d->leading_bytes);
+le64_to_cpus(&d->file_offset);
+le64_to_cpus(&d->sequence_number);
+}
+
+void vhdx_log_desc_le_export(VHDXLogDescriptor *d)
+{
+assert(d != NULL);
+
+cpu_to_le32s(&d->signature);
+cpu_to_le32s(&d->trailing_bytes);
+cpu_to_le64s(&d->leading_bytes);
+cpu_to_le64s(&d->file_offset);
+cpu_to_le64s(&d->sequence_number);
+}
+
+void vhdx_log_data_le_export(VHDXLogDataSector *d)
+{
+assert(d != NULL);
+
+cpu_to_le32s(&d->data_signature);
+cpu_to_le32s(&d->sequence_high);
+cpu_to_le32s(&d->sequence_low);
+}
+
+void vhdx_log_entry_hdr_le_import(VHDXLogEntryHeader *hdr)
+{
+assert(hdr != NULL);
+
+le32_to_cpus(&hdr->signature);
+le32_to_cpus(&hdr->checksum);
+le32_to_cpus(&hdr->entry_length);
+le32_to_cpus(&hdr->tail);
+le64_to_cpus(&hdr->sequence_number);
+le32_to_cpus(&hdr->descriptor_count);
+leguid_to_cpus(&hdr->log_guid);
+le64_to_cpus(&hdr->flushed_file_offset);
+le64_to_cpus(&hdr->last_file_offset);
+}
+
+void vhdx_log_entry_hdr_le_export(VHDXLogEntryHeader *hdr)
+{
+assert(hdr != NULL);
+
+cpu_to_le32s(&hdr->signature);
+cpu_to_le32s(&hdr->checksum);
+cpu_to_le32s(&hdr->entry_length);
+cpu_to_le32s(&hdr->tail);
+cpu_to_le64s(&hdr->sequence_number);
+cpu_to_le32s(&hdr->descriptor_count);
+cpu_to_leguid

[Qemu-devel] [PATCH v5 16/16] block: vhdx - add .bdrv_create() support

2013-09-03 Thread Jeff Cody

This adds support for VHDX image creation, for images of type "Fixed"
and "Dynamic".  "Differencing" types (i.e., VHDX images with backing
files) are currently not supported.

Options for image creation include:
* log size:
The size of the journaling log for VHDX.  Minimum is 1MB,
and it must be a multiple of 1MB. Invalid log sizes will be
silently fixed by rounding up to the nearest MB.

Default is 1MB.

* block size:
This is the size of a payload block.  The range is 1MB to 256MB,
inclusive, and must be a multiple of 1MB as well.  Invalid sizes
and multiples will be silently fixed.  If '0' is passed, then
a sane size is chosen (depending on virtual image size).

Default is 0 (Auto-select).

* subformat:
- "dynamic"
An image without data pre-allocated.
- "fixed"
An image with data pre-allocated.

Default is "dynamic"
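
The "silently fixed" behaviour of the size options above could look roughly like the sketch below. It is a guess at the policy, not the code in this patch: the function names are hypothetical, and rounding the block size up (rather than down) before clamping to the 1MB..256MB range is an assumption, since the description only says invalid values are fixed.

```c
#include <assert.h>
#include <stdint.h>

#define MIB            (1024u * 1024u)
#define BLOCK_SIZE_MIN (1u * MIB)
#define BLOCK_SIZE_MAX (256u * MIB)

/* Round a byte count up to the next multiple of 1MB. */
static uint64_t round_up_mib(uint64_t v)
{
    return (v + MIB - 1) / MIB * MIB;
}

/* Log size: at least 1MB, rounded up to a 1MB multiple. */
static uint64_t normalize_log_size(uint64_t log_size)
{
    uint64_t fixed = log_size < MIB ? MIB : round_up_mib(log_size);

    /* Also trips if the round-up overflowed for an absurdly large input. */
    assert(fixed % MIB == 0 && fixed >= log_size);
    return fixed;
}

/* Block size: 0 is passed through to mean auto-select; anything else is
 * clamped to 1MB..256MB and rounded up to a 1MB multiple (assumed). */
static uint64_t normalize_block_size(uint64_t block_size)
{
    if (block_size == 0) {
        return 0;
    }
    if (block_size < BLOCK_SIZE_MIN) {
        return BLOCK_SIZE_MIN;
    }
    if (block_size > BLOCK_SIZE_MAX) {
        return BLOCK_SIZE_MAX;
    }
    return round_up_mib(block_size);
}
```

With this policy a requested log size of 1MB plus one byte becomes 2MB, and a 1GB block size is reduced to 256MB.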

When creating the image file, the lettered sections are created:

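.-----------------------------------------------------------------.
|   (A)    |   (B)    |    (C)    |     (D)       |     (E)
|  File ID |  Header1 |  Header 2 |  Region Tbl 1 |  Region Tbl 2
|          |          |           |               |
.-----------------------------------------------------------------.
0         64KB      128KB       192KB           256KB          320KB

.---- ~ ----------- ~ ------------ ~ ---------------- ~ -----------.
|     (F)     |     (G)       |    (H)    |
| Journal Log |  BAT / Bitmap |  Metadata |  .... data ......
|             |               |           |
.---- ~ ----------- ~ ------------ ~ ---------------- ~ -----------.
1MB         (var.)          (var.)      (var.)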

Signed-off-by: Jeff Cody 
---
 block/vhdx.c | 532 +++
 block/vhdx.h |  15 +-
 2 files changed, 546 insertions(+), 1 deletion(-)

diff --git a/block/vhdx.c b/block/vhdx.c
index 94fa84f..4998707 100644
--- a/block/vhdx.c
+++ b/block/vhdx.c
@@ -22,6 +22,18 @@
 #include "block/vhdx.h"
 
 #include 
+#include 
+
+/* Options for VHDX creation */
+
+#define VHDX_BLOCK_OPT_LOG_SIZE   "log_size"
+#define VHDX_BLOCK_OPT_BLOCK_SIZE "block_size"
+
+typedef enum VHDXImageType {
+VHDX_TYPE_DYNAMIC = 0,
+VHDX_TYPE_FIXED,
+VHDX_TYPE_DIFFERENCING,   /* Currently unsupported */
+} VHDXImageType;
 
 /* Several metadata and region table data entries are identified by
  * guids in  a MS-specific GUID format. */
@@ -1261,6 +1273,523 @@ exit:
 }
 
 
+
+/*
+ * Create VHDX Headers
+ *
+ * There are 2 headers, and the highest sequence number will represent
+ * the active header
+ */
+static int vhdx_create_new_headers(BlockDriverState *bs, uint64_t image_size,
+   uint32_t log_size)
+{
+int ret = 0;
+VHDXHeader *hdr = NULL;
+
+hdr = g_malloc0(sizeof(VHDXHeader));
+
+hdr->signature   = VHDX_HEADER_SIGNATURE;
+hdr->sequence_number = g_random_int();
+hdr->log_version = 0;
+hdr->version = 1;
+hdr->log_length  = log_size;
+hdr->log_offset  = VHDX_HEADER_SECTION_END;
+vhdx_guid_generate(&hdr->file_write_guid);
+vhdx_guid_generate(&hdr->data_write_guid);
+
+ret = vhdx_write_header(bs, hdr, VHDX_HEADER1_OFFSET, false);
+if (ret < 0) {
+goto exit;
+}
+hdr->sequence_number++;
+ret = vhdx_write_header(bs, hdr, VHDX_HEADER2_OFFSET, false);
+if (ret < 0) {
+goto exit;
+}
+
+exit:
+g_free(hdr);
+return ret;
+}
+
+
+/*
+ * Create the Metadata entries.
+ *
+ * For more details on the entries, see section 3.5 (pg 29) in the
+ * VHDX 1.00 specification.
+ *
+ * We support 5 metadata entries (all required by spec):
+ *  File Parameters,
+ *  Virtual Disk Size,
+ *  Page 83 Data,
+ *  Logical Sector Size,
+ *  Physical Sector Size
+ *
+ * The first 64KB of the Metadata section is reserved for the metadata
+ * header and entries; beyond that, the metadata items themselves reside.
+ */
+static int vhdx_create_new_metadata(BlockDriverState *bs,
+uint64_t image_size,
+uint32_t block_size,
+uint32_t sector_size,
+uint64_t metadata_offset,
+VHDXImageType type)
+{
+int ret = 0;
+uint32_t offset = 0;
+void *buffer = NULL;
+void *entry_buffer;
+VHDXMetadataTableHeader *md_table;
+VHDXMetadataTableEntry  *md_table_entry;
+
+/* Metadata entries */
+VHDXFileParameters *mt_file_params;
+VHDXVirtualDiskSize*mt_virtual_size;
+VHDXPage83Data *mt_page83;
+VHDXVirtualDiskLogicalSectorSize  *mt_log_sector_size;
+VHDXVirtualDiskPhysicalSectorSize *mt_phys_sector_size;
+
+entry_buffer = g_malloc0(sizeof(VHDXFileParameters)   +
+

[Qemu-devel] [PATCH v6 07/24] target-arm: Add new AArch64CPUInfo base class and subclasses

2013-09-03 Thread Peter Maydell

Create a new AArch64CPU class; all 64-bit capable ARM
CPUs are subclasses of this. (Currently we only support
one, the "any" CPU used by linux-user.)

Signed-off-by: Peter Maydell 
---
 target-arm/Makefile.objs |1 +
 target-arm/cpu-qom.h |   12 +
 target-arm/cpu64.c   |  111 ++
 3 files changed, 124 insertions(+)
 create mode 100644 target-arm/cpu64.c

diff --git a/target-arm/Makefile.objs b/target-arm/Makefile.objs
index 2d9f77f..baebc50 100644
--- a/target-arm/Makefile.objs
+++ b/target-arm/Makefile.objs
@@ -5,3 +5,4 @@ obj-$(CONFIG_NO_KVM) += kvm-stub.o
 obj-y += translate.o op_helper.o helper.o cpu.o
 obj-y += neon_helper.o iwmmxt_helper.o
 obj-y += gdbstub.o
+obj-$(TARGET_AARCH64) += cpu64.o
diff --git a/target-arm/cpu-qom.h b/target-arm/cpu-qom.h
index 9f47bae..fbe846e 100644
--- a/target-arm/cpu-qom.h
+++ b/target-arm/cpu-qom.h
@@ -130,6 +130,18 @@ typedef struct ARMCPU {
 uint32_t reset_auxcr;
 } ARMCPU;
 
+#define TYPE_AARCH64_CPU "aarch64-cpu"
+#define AARCH64_CPU_CLASS(klass) \
+OBJECT_CLASS_CHECK(AArch64CPUClass, (klass), TYPE_AARCH64_CPU)
+#define AARCH64_CPU_GET_CLASS(obj) \
+OBJECT_GET_CLASS(AArch64CPUClass, (obj), TYPE_AARCH64_CPU)
+
+typedef struct AArch64CPUClass {
+/*< private >*/
+ARMCPUClass parent_class;
+/*< public >*/
+} AArch64CPUClass;
+
 static inline ARMCPU *arm_env_get_cpu(CPUARMState *env)
 {
 return container_of(env, ARMCPU, env);
diff --git a/target-arm/cpu64.c b/target-arm/cpu64.c
new file mode 100644
index 000..faee0f0
--- /dev/null
+++ b/target-arm/cpu64.c
@@ -0,0 +1,111 @@
+/*
+ * QEMU AArch64 CPU
+ *
+ * Copyright (c) 2013 Linaro Ltd
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see
+ * 
+ */
+
+#include "cpu.h"
+#include "qemu-common.h"
+#if !defined(CONFIG_USER_ONLY)
+#include "hw/loader.h"
+#endif
+#include "hw/arm/arm.h"
+#include "sysemu/sysemu.h"
+#include "sysemu/kvm.h"
+
+static inline void set_feature(CPUARMState *env, int feature)
+{
+env->features |= 1ULL << feature;
+}
+
+#ifdef CONFIG_USER_ONLY
+static void aarch64_any_initfn(Object *obj)
+{
+ARMCPU *cpu = ARM_CPU(obj);
+
+set_feature(&cpu->env, ARM_FEATURE_V8);
+set_feature(&cpu->env, ARM_FEATURE_VFP4);
+set_feature(&cpu->env, ARM_FEATURE_VFP_FP16);
+set_feature(&cpu->env, ARM_FEATURE_NEON);
+set_feature(&cpu->env, ARM_FEATURE_THUMB2EE);
+set_feature(&cpu->env, ARM_FEATURE_ARM_DIV);
+set_feature(&cpu->env, ARM_FEATURE_V7MP);
+set_feature(&cpu->env, ARM_FEATURE_AARCH64);
+}
+#endif
+
+typedef struct ARMCPUInfo {
+const char *name;
+void (*initfn)(Object *obj);
+void (*class_init)(ObjectClass *oc, void *data);
+} ARMCPUInfo;
+
+static const ARMCPUInfo aarch64_cpus[] = {
+#ifdef CONFIG_USER_ONLY
+{ .name = "any", .initfn = aarch64_any_initfn },
+#endif
+};
+
+static void aarch64_cpu_initfn(Object *obj)
+{
+}
+
+static void aarch64_cpu_finalizefn(Object *obj)
+{
+}
+
+static void aarch64_cpu_class_init(ObjectClass *oc, void *data)
+{
+}
+
+static void aarch64_cpu_register(const ARMCPUInfo *info)
+{
+TypeInfo type_info = {
+.parent = TYPE_AARCH64_CPU,
+.instance_size = sizeof(ARMCPU),
+.instance_init = info->initfn,
+.class_size = sizeof(ARMCPUClass),
+.class_init = info->class_init,
+};
+
+type_info.name = g_strdup_printf("%s-" TYPE_ARM_CPU, info->name);
+type_register(&type_info);
+g_free((void *)type_info.name);
+}
+
+static const TypeInfo aarch64_cpu_type_info = {
+.name = TYPE_AARCH64_CPU,
+.parent = TYPE_ARM_CPU,
+.instance_size = sizeof(ARMCPU),
+.instance_init = aarch64_cpu_initfn,
+.instance_finalize = aarch64_cpu_finalizefn,
+.abstract = true,
+.class_size = sizeof(AArch64CPUClass),
+.class_init = aarch64_cpu_class_init,
+};
+
+static void aarch64_cpu_register_types(void)
+{
+int i;
+
+type_register_static(&aarch64_cpu_type_info);
+for (i = 0; i < ARRAY_SIZE(aarch64_cpus); i++) {
+aarch64_cpu_register(&aarch64_cpus[i]);
+}
+}
+
+type_init(aarch64_cpu_register_types)
-- 
1.7.9.5

[Qemu-devel] [PATCH v6 22/24] configure: Add handling code for AArch64 targets

2013-09-03 Thread Peter Maydell

From: Alexander Graf 

Add the necessary code to configure to handle AArch64 as a target
CPU (we already have some code for supporting it as host). Note
that this doesn't enable the AArch64 targets yet.

Signed-off-by: Alexander Graf 
Signed-off-by: John Rigby 
Message-id: 1368505980-17151-12-git-send-email-john.ri...@linaro.org
[PMM:
 * don't need to set TARGET_ABI_DIR to aarch64 as that is the default
 * don't build nwfpe -- this is 32 bit legacy only
 * rewrite commit message
 * add aarch64 to the list of "fdt required" targets
]
Signed-off-by: Peter Maydell 
---
 configure |7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/configure b/configure
index af6b048..e20869b 100755
--- a/configure
+++ b/configure
@@ -2523,7 +2523,7 @@ fi
 fdt_required=no
 for target in $target_list; do
   case $target in
-arm*-softmmu|ppc*-softmmu|microblaze*-softmmu)
+aarch64*-softmmu|arm*-softmmu|ppc*-softmmu|microblaze*-softmmu)
   fdt_required=yes
 ;;
   esac
@@ -4273,6 +4273,11 @@ case "$target_name" in
 bflt="yes"
 gdb_xml_files="arm-core.xml arm-vfp.xml arm-vfp3.xml arm-neon.xml"
   ;;
+  aarch64)
+TARGET_BASE_ARCH=arm
+bflt="yes"
+gdb_xml_files="aarch64-core.xml"
+  ;;
   cris)
   ;;
   lm32)
-- 
1.7.9.5

[Qemu-devel] [PATCH v6 14/24] linux-user: Add syscall number definitions for AArch64

2013-09-03 Thread Peter Maydell

From: Alexander Graf 

The AArch64 syscall definitions are all publicly available in the Linux
kernel. Let's add them to our linux-user emulation target, so that we
can easily handle AArch64 syscalls.

Signed-off-by: Alexander Graf 
Signed-off-by: John Rigby 
Message-id: 1368505980-17151-8-git-send-email-john.ri...@linaro.org
[PMM: changes relating to cpu_loop() removed as they are superseded
 by an earlier patch]
Signed-off-by: Peter Maydell 
---
 linux-user/aarch64/syscall_nr.h |  323 +++
 1 file changed, 323 insertions(+)
 create mode 100644 linux-user/aarch64/syscall_nr.h

diff --git a/linux-user/aarch64/syscall_nr.h b/linux-user/aarch64/syscall_nr.h
new file mode 100644
index 0000000..743255d
--- /dev/null
+++ b/linux-user/aarch64/syscall_nr.h
@@ -0,0 +1,323 @@
+/*
+ * This file contains the system call numbers.
+ */
+
+#define TARGET_NR_io_setup 0
+#define TARGET_NR_io_destroy 1
+#define TARGET_NR_io_submit 2
+#define TARGET_NR_io_cancel 3
+#define TARGET_NR_io_getevents 4
+#define TARGET_NR_setxattr 5
+#define TARGET_NR_lsetxattr 6
+#define TARGET_NR_fsetxattr 7
+#define TARGET_NR_getxattr 8
+#define TARGET_NR_lgetxattr 9
+#define TARGET_NR_fgetxattr 10
+#define TARGET_NR_listxattr 11
+#define TARGET_NR_llistxattr 12
+#define TARGET_NR_flistxattr 13
+#define TARGET_NR_removexattr 14
+#define TARGET_NR_lremovexattr 15
+#define TARGET_NR_fremovexattr 16
+#define TARGET_NR_getcwd 17
+#define TARGET_NR_lookup_dcookie 18
+#define TARGET_NR_eventfd2 19
+#define TARGET_NR_epoll_create1 20
+#define TARGET_NR_epoll_ctl 21
+#define TARGET_NR_epoll_pwait 22
+#define TARGET_NR_dup 23
+#define TARGET_NR_dup3 24
+#define TARGET_NR_fcntl 25
+#define TARGET_NR_inotify_init1 26
+#define TARGET_NR_inotify_add_watch 27
+#define TARGET_NR_inotify_rm_watch 28
+#define TARGET_NR_ioctl 29
+#define TARGET_NR_ioprio_set 30
+#define TARGET_NR_ioprio_get 31
+#define TARGET_NR_flock 32
+#define TARGET_NR_mknodat 33
+#define TARGET_NR_mkdirat 34
+#define TARGET_NR_unlinkat 35
+#define TARGET_NR_symlinkat 36
+#define TARGET_NR_linkat 37
+#define TARGET_NR_renameat 38
+#define TARGET_NR_umount2 39
+#define TARGET_NR_mount 40
+#define TARGET_NR_pivot_root 41
+#define TARGET_NR_nfsservctl 42
+#define TARGET_NR_statfs 43
+#define TARGET_NR_fstatfs 44
+#define TARGET_NR_truncate 45
+#define TARGET_NR_ftruncate 46
+#define TARGET_NR_fallocate 47
+#define TARGET_NR_faccessat 48
+#define TARGET_NR_chdir 49
+#define TARGET_NR_fchdir 50
+#define TARGET_NR_chroot 51
+#define TARGET_NR_fchmod 52
+#define TARGET_NR_fchmodat 53
+#define TARGET_NR_fchownat 54
+#define TARGET_NR_fchown 55
+#define TARGET_NR_openat 56
+#define TARGET_NR_close 57
+#define TARGET_NR_vhangup 58
+#define TARGET_NR_pipe2 59
+#define TARGET_NR_quotactl 60
+#define TARGET_NR_getdents64 61
+#define TARGET_NR_lseek 62
+#define TARGET_NR_read 63
+#define TARGET_NR_write 64
+#define TARGET_NR_readv 65
+#define TARGET_NR_writev 66
+#define TARGET_NR_pread64 67
+#define TARGET_NR_pwrite64 68
+#define TARGET_NR_preadv 69
+#define TARGET_NR_pwritev 70
+#define TARGET_NR_sendfile 71
+#define TARGET_NR_pselect6 72
+#define TARGET_NR_ppoll 73
+#define TARGET_NR_signalfd4 74
+#define TARGET_NR_vmsplice 75
+#define TARGET_NR_splice 76
+#define TARGET_NR_tee 77
+#define TARGET_NR_readlinkat 78
+#define TARGET_NR_fstatat64 79
+#define TARGET_NR_fstat 80
+#define TARGET_NR_sync 81
+#define TARGET_NR_fsync 82
+#define TARGET_NR_fdatasync 83
+#define TARGET_NR_sync_file_range2 84
+/* #define TARGET_NR_sync_file_range 84 */
+#define TARGET_NR_timerfd_create 85
+#define TARGET_NR_timerfd_settime 86
+#define TARGET_NR_timerfd_gettime 87
+#define TARGET_NR_utimensat 88
+#define TARGET_NR_acct 89
+#define TARGET_NR_capget 90
+#define TARGET_NR_capset 91
+#define TARGET_NR_personality 92
+#define TARGET_NR_exit 93
+#define TARGET_NR_exit_group 94
+#define TARGET_NR_waitid 95
+#define TARGET_NR_set_tid_address 96
+#define TARGET_NR_unshare 97
+#define TARGET_NR_futex 98
+#define TARGET_NR_set_robust_list 99
+#define TARGET_NR_get_robust_list 100
+#define TARGET_NR_nanosleep 101
+#define TARGET_NR_getitimer 102
+#define TARGET_NR_setitimer 103
+#define TARGET_NR_kexec_load 104
+#define TARGET_NR_init_module 105
+#define TARGET_NR_delete_module 106
+#define TARGET_NR_timer_create 107
+#define TARGET_NR_timer_gettime 108
+#define TARGET_NR_timer_getoverrun 109
+#define TARGET_NR_timer_settime 110
+#define TARGET_NR_timer_delete 111
+#define TARGET_NR_clock_settime 112
+#define TARGET_NR_clock_gettime 113
+#define TARGET_NR_clock_getres 114
+#define TARGET_NR_clock_nanosleep 115
+#define TARGET_NR_syslog 116
+#define TARGET_NR_ptrace 117
+#define TARGET_NR_sched_setparam 118
+#define TARGET_NR_sched_setscheduler 119
+#define TARGET_NR_sched_getscheduler 120
+#define TARGET_NR_sched_getparam 121
+#define TARGET_NR_sched_setaffinity 122
+#define TARGET_NR_sched_getaffinity 123
+#define TARGET_NR_sched_yield 124
+#define TARGET_NR_sched_get_priority_max 125
+#defin

[Qemu-devel] [PATCH v6 16/24] linux-user: Add signal handling for AArch64

2013-09-03 Thread Peter Maydell

From: Andreas Schwab 

This patch adds signal handling for AArch64. The code is based on the
respective source in the Linux kernel.
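
The comments in the signal.c hunk of this patch describe extension records
that each start with a magic/size header and are terminated by a record whose
magic and size are both 0. A minimal sketch of how a consumer might walk such
a chain follows. It is an illustration only: toy_ctx_head and toy_find_ctx()
are hypothetical names that do not appear in the patch, and the bounds checks
are an assumption about what a careful reader of the records would want.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of the magic/size header used by the patch. */
struct toy_ctx_head {
    uint32_t magic;
    uint32_t size;
};

/* Return the byte offset of the first record whose magic matches
 * 'wanted', or -1 if the terminator (magic == 0, size == 0) or a
 * malformed record is reached first.
 */
static long toy_find_ctx(const unsigned char *buf, size_t len, uint32_t wanted)
{
    size_t off = 0;

    assert(buf != NULL);
    while (off + sizeof(struct toy_ctx_head) <= len) {
        struct toy_ctx_head h;

        memcpy(&h, buf + off, sizeof(h));
        if (h.magic == 0 && h.size == 0) {
            break;              /* dummy end record */
        }
        if (h.magic == wanted) {
            return (long)off;
        }
        if (h.size < sizeof(h) || h.size > len - off) {
            break;              /* malformed size, stop walking */
        }
        off += h.size;
    }
    return -1;
}
```

The size field is what lets a reader skip records it does not understand,
which is why the patch's comment asks user space to check magic and size
rather than rely on a fixed layout.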

Signed-off-by: Andreas Schwab 
Signed-off-by: Alexander Graf 
Signed-off-by: John Rigby 
Message-id: 1368505980-17151-10-git-send-email-john.ri...@linaro.org
[PMM: fixed style nits: tabs, long lines;
 pulled target_signal.h in from a later patch; it fits better here]
Signed-off-by: Peter Maydell 
---
 linux-user/aarch64/target_signal.h |   29 
 linux-user/signal.c                |  260 
 2 files changed, 289 insertions(+)
 create mode 100644 linux-user/aarch64/target_signal.h

diff --git a/linux-user/aarch64/target_signal.h b/linux-user/aarch64/target_signal.h
new file mode 100644
index 0000000..e8c677d
--- /dev/null
+++ b/linux-user/aarch64/target_signal.h
@@ -0,0 +1,29 @@
+#ifndef TARGET_SIGNAL_H
+#define TARGET_SIGNAL_H
+
+#include "cpu.h"
+
+/* this struct defines a stack used during syscall handling */
+
+typedef struct target_sigaltstack {
+abi_ulong ss_sp;
+abi_int ss_flags;
+abi_ulong ss_size;
+} target_stack_t;
+
+
+/*
+ * sigaltstack controls
+ */
+#define TARGET_SS_ONSTACK 1
+#define TARGET_SS_DISABLE 2
+
+#define TARGET_MINSIGSTKSZ 2048
+#define TARGET_SIGSTKSZ 8192
+
+static inline abi_ulong get_sp_from_cpustate(CPUARMState *state)
+{
+   return state->xregs[31];
+}
+
+#endif /* TARGET_SIGNAL_H */
diff --git a/linux-user/signal.c b/linux-user/signal.c
index 23d65da..7751c47 100644
--- a/linux-user/signal.c
+++ b/linux-user/signal.c
@@ -1092,6 +1092,266 @@ badframe:
return 0;
 }
 
+#elif defined(TARGET_AARCH64)
+
+struct target_sigcontext {
+uint64_t fault_address;
+/* AArch64 registers */
+uint64_t regs[31];
+uint64_t sp;
+uint64_t pc;
+uint64_t pstate;
+/* 4K reserved for FP/SIMD state and future expansion */
+char __reserved[4096] __attribute__((__aligned__(16)));
+};
+
+struct target_ucontext {
+abi_ulong tuc_flags;
+abi_ulong tuc_link;
+target_stack_t tuc_stack;
+target_sigset_t tuc_sigmask;
+/* glibc uses a 1024-bit sigset_t */
+char __unused[1024 / 8 - sizeof(target_sigset_t)];
+/* last for future expansion */
+struct target_sigcontext tuc_mcontext;
+};
+
+/*
+ * Header to be used at the beginning of structures extending the user
+ * context. Such structures must be placed after the rt_sigframe on the stack
+ * and be 16-byte aligned. The last structure must be a dummy one with the
+ * magic and size set to 0.
+ */
+struct target_aarch64_ctx {
+uint32_t magic;
+uint32_t size;
+};
+
+#define TARGET_FPSIMD_MAGIC 0x46508001
+
+struct target_fpsimd_context {
+struct target_aarch64_ctx head;
+uint32_t fpsr;
+uint32_t fpcr;
+uint64_t vregs[32 * 2]; /* really uint128_t vregs[32] */
+};
+
+/*
+ * Auxiliary context saved in the sigcontext.__reserved array. Not exported to
+ * user space as it will change with the addition of new context. User space
+ * should check the magic/size information.
+ */
+struct target_aux_context {
+struct target_fpsimd_context fpsimd;
+/* additional context to be added before "end" */
+struct target_aarch64_ctx end;
+};
+
+struct target_rt_sigframe {
+struct target_siginfo info;
+struct target_ucontext uc;
+uint64_t fp;
+uint64_t lr;
+uint32_t tramp[2];
+};
+
+static int target_setup_sigframe(struct target_rt_sigframe *sf,
+ CPUARMState *env, target_sigset_t *set)
+{
+int i;
+struct target_aux_context *aux =
+(struct target_aux_context *)sf->uc.tuc_mcontext.__reserved;
+
+/* set up the stack frame for unwinding */
+__put_user(env->xregs[29], &sf->fp);
+__put_user(env->xregs[30], &sf->lr);
+
+for (i = 0; i < 31; i++) {
+__put_user(env->xregs[i], &sf->uc.tuc_mcontext.regs[i]);
+}
+__put_user(env->xregs[31], &sf->uc.tuc_mcontext.sp);
+__put_user(env->pc, &sf->uc.tuc_mcontext.pc);
+__put_user(env->pstate, &sf->uc.tuc_mcontext.pstate);
+
+__put_user(/*current->thread.fault_address*/ 0,
+&sf->uc.tuc_mcontext.fault_address);
+
+for (i = 0; i < TARGET_NSIG_WORDS; i++) {
+__put_user(set->sig[i], &sf->uc.tuc_sigmask.sig[i]);
+}
+
+for (i = 0; i < 32; i++) {
+#ifdef TARGET_WORDS_BIGENDIAN
+__put_user(env->vfp.regs[i * 2], &aux->fpsimd.vregs[i * 2 + 1]);
+__put_user(env->vfp.regs[i * 2 + 1], &aux->fpsimd.vregs[i * 2]);
+#else
+__put_user(env->vfp.regs[i * 2], &aux->fpsimd.vregs[i * 2]);
+__put_user(env->vfp.regs[i * 2 + 1], &aux->fpsimd.vregs[i * 2 + 1]);
+#endif
+}
+__put_user(/*env->fpsr*/0, &aux->fpsimd.fpsr);
+__put_user(/*env->fpcr*/0, &aux->fpsimd.fpcr);
+__put_user(TARGET_FPSIMD_MAGIC, &aux->fpsimd.head.magic);
+__put_user(sizeof(struct target_fpsimd_context),
+&aux->fpsimd.head.size);
+
+/* set the "end" magic */
+__put_user(0, &aux->end.magic);
+__put_user(0, &aux->e

[Qemu-devel] [PATCH v6 02/24] target-arm: Abstract out load/store from a vaddr in AArch32

2013-09-03 Thread Peter Maydell

AArch32 code (ie traditional 32 bit world) expects to be
able to pass a vaddr in a TCGv_i32. However when QEMU is
compiled with TARGET_LONG_BITS=64 the TCG load/store
functions take a TCGv_i64. Abstract out load/store with
a 32 bit vaddr so we have a place to put the zero extension
of the vaddr and the extension/truncation of the data value.

Apart from the function definitions most of this patch is
a simple s/tcg_gen_qemu_/gen_aa32_/.
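
As a rough illustration of the zero extension and truncation described above,
here is a plain C sketch outside of TCG. It assumes a build where guest
addresses are 64 bits wide. All names (toy_ram, toy_ld32u_wide,
toy_aa32_ld32u) are hypothetical and only model the idea; they are not the
functions added by this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy little-endian guest RAM; purely illustrative. */
static const uint8_t toy_ram[8] = {
    0x78, 0x56, 0x34, 0x12, 0xef, 0xcd, 0xab, 0x90
};

/* Hypothetical stand-in for a load op that wants a 64 bit address
 * and produces a 64 bit value.
 */
static uint64_t toy_ld32u_wide(uint64_t addr64)
{
    uint64_t val = 0;
    size_t i;

    assert(addr64 <= sizeof(toy_ram) - 4);
    for (i = 0; i < 4; i++) {
        val |= (uint64_t)toy_ram[addr64 + i] << (8 * i);
    }
    return val;
}

/* AArch32-style wrapper: 32 bit vaddr in, 32 bit data out. */
static uint32_t toy_aa32_ld32u(uint32_t addr)
{
    uint64_t addr64 = addr;                  /* zero extend the vaddr */
    uint64_t val64 = toy_ld32u_wide(addr64); /* do the wide access */

    return (uint32_t)val64;                  /* truncate the data */
}
```

Keeping the widening and narrowing in one wrapper means callers never see the
64 bit temporaries, which is the point of the abstraction.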

Signed-off-by: Peter Maydell 
---
 target-arm/translate.c |  334 ++--
 1 file changed, 210 insertions(+), 124 deletions(-)

diff --git a/target-arm/translate.c b/target-arm/translate.c
index 9160ced..8e58eb1 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -842,6 +842,90 @@ static inline void store_reg_from_load(CPUARMState *env, DisasContext *s,
 }
 }
 
+/* Abstractions of "generate code to do a guest load/store for
+ * AArch32", where a vaddr is always 32 bits (and is zero
+ * extended if we're a 64 bit core) and  data is also
+ * 32 bits unless specifically doing a 64 bit access.
+ * These functions work like tcg_gen_qemu_{ld,st}* except
+ * that their arguments are TCGv_i32 rather than TCGv.
+ */
+#if TARGET_LONG_BITS == 32
+
+#define DO_GEN_LD(OP)\
+static inline void gen_aa32_##OP(TCGv_i32 val, TCGv_i32 addr, int index) \
+{\
+tcg_gen_qemu_##OP(val, addr, index); \
+}
+
+#define DO_GEN_ST(OP)\
+static inline void gen_aa32_##OP(TCGv_i32 val, TCGv_i32 addr, int index) \
+{\
+tcg_gen_qemu_##OP(val, addr, index); \
+}
+
+static inline void gen_aa32_ld64(TCGv_i64 val, TCGv_i32 addr, int index)
+{
+tcg_gen_qemu_ld64(val, addr, index);
+}
+
+static inline void gen_aa32_st64(TCGv_i64 val, TCGv_i32 addr, int index)
+{
+tcg_gen_qemu_st64(val, addr, index);
+}
+
+#else
+
+#define DO_GEN_LD(OP)\
+static inline void gen_aa32_##OP(TCGv_i32 val, TCGv_i32 addr, int index) \
+{\
+TCGv addr64 = tcg_temp_new();\
+TCGv val64 = tcg_temp_new(); \
+tcg_gen_extu_i32_i64(addr64, addr);  \
+tcg_gen_qemu_##OP(val64, addr64, index); \
+tcg_temp_free(addr64);   \
+tcg_gen_trunc_i64_i32(val, val64);   \
+tcg_temp_free(val64);\
+}
+
+#define DO_GEN_ST(OP)\
+static inline void gen_aa32_##OP(TCGv_i32 val, TCGv_i32 addr, int index) \
+{\
+TCGv addr64 = tcg_temp_new();\
+TCGv val64 = tcg_temp_new(); \
+tcg_gen_extu_i32_i64(addr64, addr);  \
+tcg_gen_extu_i32_i64(val64, val);\
+tcg_gen_qemu_##OP(val64, addr64, index); \
+tcg_temp_free(addr64);   \
+tcg_temp_free(val64);\
+}
+
+static inline void gen_aa32_ld64(TCGv_i64 val, TCGv_i32 addr, int index)
+{
+TCGv addr64 = tcg_temp_new();
+tcg_gen_extu_i32_i64(addr64, addr);
+tcg_gen_qemu_ld64(val, addr64, index);
+tcg_temp_free(addr64);
+}
+
+static inline void gen_aa32_st64(TCGv_i64 val, TCGv_i32 addr, int index)
+{
+TCGv addr64 = tcg_temp_new();
+tcg_gen_extu_i32_i64(addr64, addr);
+tcg_gen_qemu_st64(val, addr64, index);
+tcg_temp_free(addr64);
+}
+
+#endif
+
+DO_GEN_LD(ld8s)
+DO_GEN_LD(ld8u)
+DO_GEN_LD(ld16s)
+DO_GEN_LD(ld16u)
+DO_GEN_LD(ld32u)
+DO_GEN_ST(st8)
+DO_GEN_ST(st16)
+DO_GEN_ST(st32)
+
 static inline void gen_set_pc_im(uint32_t val)
 {
 tcg_gen_movi_i32(cpu_R[15], val);
@@ -1071,18 +1155,20 @@ VFP_GEN_FIX(ulto)
 
 static inline void gen_vfp_ld(DisasContext *s, int dp, TCGv_i32 addr)
 {
-if (dp)
-tcg_gen_qemu_ld64(cpu_F0d, addr, IS_USER(s));
-else
-tcg_gen_qemu_ld32u(cpu_F0s, addr, IS_USER(s));
+if (dp) {
+gen_aa32_ld64(cpu_F0d, addr, IS_USER(s));
+} else {
+gen_aa32_ld32u(cpu_F0s, addr, IS_USER(s));
+}
 }
 
 static inline void gen_vfp_st(DisasContext *s, int dp, TCGv_i32 addr)
 {
-if (dp)
-tcg_gen_qemu_st64(cpu_F0d, addr, IS_USER(s));
-else
-tcg_gen_qemu_st32(cpu_F0s, addr, IS_USER(s));
+if (dp) {
+gen_aa32_st64(cpu_

[Qemu-devel] [PATCH v6 20/24] linux-user: Allow targets to specify a minimum uname release

2013-09-03 Thread Peter Maydell

For newer target architectures, glibc can be picky about the kernel
version: for example, it will not run on an aarch64 system unless
the kernel reports itself as at least 3.8.0. Accommodate this by
enhancing the existing support for faking the kernel version so
that each target can optionally specify a minimum version: if
the user doesn't force a specific fake version then we will override
with the minimum required version only if the real host kernel
version is insufficient.

Use this facility to let aarch64 report a minimum of 3.8.0.
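
The selection rule described above can be sketched in isolation as follows.
relstr_to_int() follows the packing used in this patch; toy_pick_release() is
a hypothetical helper written only for this illustration (the patch itself
sets qemu_uname_release instead of returning a string), and the release
strings in the usage note are made-up examples.

```c
#include <assert.h>
#include <stddef.h>

/* Pack "major.minor.patch" into one integer, 8 bits per component,
 * so that "2.6.18" becomes 0x020612.
 */
static int relstr_to_int(const char *s)
{
    int i, n, tmp = 0;

    for (i = 0; i < 3; i++) {
        n = 0;
        while (*s >= '0' && *s <= '9') {
            n = n * 10 + (*s - '0');
            s++;
        }
        tmp = (tmp << 8) + n;
        if (*s == '.') {
            s++;
        }
    }
    return tmp;
}

/* Hypothetical helper: choose the release string to report.
 * A version forced by the user always wins; otherwise the minimum
 * is used only when the host release is older than it.
 */
static const char *toy_pick_release(const char *forced, const char *host,
                                    const char *minimum)
{
    assert(host != NULL && minimum != NULL);
    if (forced && *forced) {
        return forced;
    }
    if (relstr_to_int(host) < relstr_to_int(minimum)) {
        return minimum;
    }
    return host;
}
```

With a minimum of "3.8.0", a host reporting "3.2.0-4-amd64" would be
overridden, while a host reporting "3.10.0" would be left alone because the
comparison is numeric per component, not a string comparison.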

Signed-off-by: Peter Maydell 
---
 linux-user/main.c    |    2 ++
 linux-user/qemu.h    |    1 +
 linux-user/syscall.c |   62 ++
 3 files changed, 51 insertions(+), 14 deletions(-)

diff --git a/linux-user/main.c b/linux-user/main.c
index 28cc58a..d530f01 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -3672,6 +3672,8 @@ int main(int argc, char **argv, char **envp)
 /* Scan interp_prefix dir for replacement files. */
 init_paths(interp_prefix);
 
+init_qemu_uname_release();
+
 if (cpu_model == NULL) {
 #if defined(TARGET_I386)
 #ifdef TARGET_X86_64
diff --git a/linux-user/qemu.h b/linux-user/qemu.h
index 4df4fcb..6ffe5a2 100644
--- a/linux-user/qemu.h
+++ b/linux-user/qemu.h
@@ -197,6 +197,7 @@ extern THREAD CPUState *thread_cpu;
 void cpu_loop(CPUArchState *env);
 char *target_strerror(int err);
 int get_osversion(void);
+void init_qemu_uname_release(void);
 void fork_start(void);
 void fork_end(int child);
 
diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 73046b0..180463d 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -4863,12 +4863,35 @@ int host_to_target_waitstatus(int status)
 return status;
 }
 
+static int relstr_to_int(const char *s)
+{
+/* Convert a uname release string like "2.6.18" to an integer
+ * of the form 0x020612. (Beware that 0x020612 is *not* 2.6.12.)
+ */
+int i, n, tmp;
+
+tmp = 0;
+for (i = 0; i < 3; i++) {
+n = 0;
+while (*s >= '0' && *s <= '9') {
+n *= 10;
+n += *s - '0';
+s++;
+}
+tmp = (tmp << 8) + n;
+if (*s == '.') {
+s++;
+}
+}
+return tmp;
+}
+
 int get_osversion(void)
 {
 static int osversion;
 struct new_utsname buf;
 const char *s;
-int i, n, tmp;
+
 if (osversion)
 return osversion;
 if (qemu_uname_release && *qemu_uname_release) {
@@ -4878,22 +4901,33 @@ int get_osversion(void)
 return 0;
 s = buf.release;
 }
-tmp = 0;
-for (i = 0; i < 3; i++) {
-n = 0;
-while (*s >= '0' && *s <= '9') {
-n *= 10;
-n += *s - '0';
-s++;
-}
-tmp = (tmp << 8) + n;
-if (*s == '.')
-s++;
-}
-osversion = tmp;
+osversion = relstr_to_int(s);
 return osversion;
 }
 
+void init_qemu_uname_release(void)
+{
+/* Initialize qemu_uname_release for later use.
+ * If the host kernel is too old and the user hasn't asked for
+ * a specific fake version number, we might want to fake a minimum
+ * target kernel version.
+ */
+#ifdef UNAME_MINIMUM_RELEASE
+struct new_utsname buf;
+
+if (qemu_uname_release && *qemu_uname_release) {
+return;
+}
+
+if (sys_uname(&buf)) {
+return;
+}
+
+if (relstr_to_int(buf.release) < relstr_to_int(UNAME_MINIMUM_RELEASE)) {
+qemu_uname_release = UNAME_MINIMUM_RELEASE;
+}
+#endif
+}
 
 static int open_self_maps(void *cpu_env, int fd)
 {
-- 
1.7.9.5

[Qemu-devel] [PATCH v6 19/24] linux-user: Add AArch64 termbits.h definitions

2013-09-03 Thread Peter Maydell

From: Alexander Graf 

Add the AArch64 termbits.h with all the target's termios related
constants and structures.

Signed-off-by: Alexander Graf 
Signed-off-by: John Rigby 
[PMM: split out from another patch]
Signed-off-by: Peter Maydell 
---
 linux-user/aarch64/termbits.h |  220 +
 1 file changed, 220 insertions(+)
 create mode 100644 linux-user/aarch64/termbits.h

diff --git a/linux-user/aarch64/termbits.h b/linux-user/aarch64/termbits.h
new file mode 100644
index 0000000..b64ba97
--- /dev/null
+++ b/linux-user/aarch64/termbits.h
@@ -0,0 +1,220 @@
+/* from asm/termbits.h */
+/* NOTE: exactly the same as i386 */
+
+#define TARGET_NCCS 19
+
+struct target_termios {
+unsigned int c_iflag;   /* input mode flags */
+unsigned int c_oflag;   /* output mode flags */
+unsigned int c_cflag;   /* control mode flags */
+unsigned int c_lflag;   /* local mode flags */
+unsigned char c_line;/* line discipline */
+unsigned char c_cc[TARGET_NCCS];/* control characters */
+};
+
+/* c_iflag bits */
+#define TARGET_IGNBRK  0000001
+#define TARGET_BRKINT  0000002
+#define TARGET_IGNPAR  0000004
+#define TARGET_PARMRK  0000010
+#define TARGET_INPCK   0000020
+#define TARGET_ISTRIP  0000040
+#define TARGET_INLCR   0000100
+#define TARGET_IGNCR   0000200
+#define TARGET_ICRNL   0000400
+#define TARGET_IUCLC   0001000
+#define TARGET_IXON    0002000
+#define TARGET_IXANY   0004000
+#define TARGET_IXOFF   0010000
+#define TARGET_IMAXBEL 0020000
+#define TARGET_IUTF8   0040000
+
+/* c_oflag bits */
+#define TARGET_OPOST   0000001
+#define TARGET_OLCUC   0000002
+#define TARGET_ONLCR   0000004
+#define TARGET_OCRNL   0000010
+#define TARGET_ONOCR   0000020
+#define TARGET_ONLRET  0000040
+#define TARGET_OFILL   0000100
+#define TARGET_OFDEL   0000200
+#define TARGET_NLDLY   0000400
+#define   TARGET_NL0   0000000
+#define   TARGET_NL1   0000400
+#define TARGET_CRDLY   0003000
+#define   TARGET_CR0   0000000
+#define   TARGET_CR1   0001000
+#define   TARGET_CR2   0002000
+#define   TARGET_CR3   0003000
+#define TARGET_TABDLY  0014000
+#define   TARGET_TAB0  0000000
+#define   TARGET_TAB1  0004000
+#define   TARGET_TAB2  0010000
+#define   TARGET_TAB3  0014000
+#define   TARGET_XTABS 0014000
+#define TARGET_BSDLY   0020000
+#define   TARGET_BS0   0000000
+#define   TARGET_BS1   0020000
+#define TARGET_VTDLY   0040000
+#define   TARGET_VT0   0000000
+#define   TARGET_VT1   0040000
+#define TARGET_FFDLY   0100000
+#define   TARGET_FF0   0000000
+#define   TARGET_FF1   0100000
+
+/* c_cflag bit meaning */
+#define TARGET_CBAUD   0010017
+#define  TARGET_B0     0000000 /* hang up */
+#define  TARGET_B50    0000001
+#define  TARGET_B75    0000002
+#define  TARGET_B110   0000003
+#define  TARGET_B134   0000004
+#define  TARGET_B150   0000005
+#define  TARGET_B200   0000006
+#define  TARGET_B300   0000007
+#define  TARGET_B600   0000010
+#define  TARGET_B1200  0000011
+#define  TARGET_B1800  0000012
+#define  TARGET_B2400  0000013
+#define  TARGET_B4800  0000014
+#define  TARGET_B9600  0000015
+#define  TARGET_B19200 0000016
+#define  TARGET_B38400 0000017
+#define TARGET_EXTA B19200
+#define TARGET_EXTB B38400
+#define TARGET_CSIZE   0000060
+#define   TARGET_CS5   0000000
+#define   TARGET_CS6   0000020
+#define   TARGET_CS7   0000040
+#define   TARGET_CS8   0000060
+#define TARGET_CSTOPB  0000100
+#define TARGET_CREAD   0000200
+#define TARGET_PARENB  0000400
+#define TARGET_PARODD  0001000
+#define TARGET_HUPCL   0002000
+#define TARGET_CLOCAL  0004000
+#define TARGET_CBAUDEX 0010000
+#define  TARGET_B57600  0010001
+#define  TARGET_B115200 0010002
+#define  TARGET_B230400 0010003
+#define  TARGET_B460800 0010004
+#define TARGET_CIBAUD    002003600000  /* input baud rate (not used) */
+#define TARGET_CMSPAR    010000000000  /* mark or space (stick) parity */
+#define TARGET_CRTSCTS   020000000000  /* flow control */
+
+/* c_lflag bits */
+#define TARGET_ISIG    0000001
+#define TARGET_ICANON  0000002
+#define TARGET_XCASE   0000004
+#define TARGET_ECHO    0000010
+#define TARGET_ECHOE   0000020
+#define TARGET_ECHOK   0000040
+#define TARGET_ECHONL  0000100
+#define TARGET_NOFLSH  0000200
+#define TARGET_TOSTOP  0000400
+#define TARGET_ECHOCTL 0001000
+#define TARGET_ECHOPRT 0002000
+#define TARGET_ECHOKE  0004000
+#define TARGET_FLUSHO  0010000
+#define TARGET_PENDIN  0040000
+#define TARGET_IEXTEN  0100000
+
+/* c_cc character offsets */
+#define TARGET_VINTR    0
+#define TARGET_VQUIT    1
+#define TARGET_VERASE   2
+#define TARGET_VKILL    3
+#define TARGET_VEOF     4
+#define TARGET_VTIME    5
+#define TARGET_VMIN     6
+#define TARGET_VSWTC    7
+#define TARGET_VSTART   8
+#define TARGET_VSTOP    9
+#define TARGET_VSUSP    10
+#define TARGET_VEOL 11
+#define TARGET_VREPRINT 12
+#define TARGET_VDISCARD 13
+#define TARGET_VWERASE  14
+#define TARGET_VLNEXT   15
+#define TARGE

[Qemu-devel] [PATCH v6 09/24] target-arm: Prepare translation for AArch64 code

2013-09-03 Thread Peter Maydell

From: Alexander Graf 

This patch adds all the prerequisites for AArch64 support that didn't
fit into split up patches. It extends important bits in the core cpu
headers to also take AArch64 mode into account.

Add a new ARM_TBFLAG_AARCH64_STATE translation buffer flag to
indicate whether an ARMv8 cpu is running in aarch64 mode or aarch32 mode.
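
The comment this patch adds to cpu.h describes how the S, D and Q register
views index into the shared 64-entry register array in each execution state.
A small sketch of that indexing follows. The toy_* helper names are
hypothetical, and the array is modelled as plain uint64_t values rather than
float64 purely to keep the example self-contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Index of the 64 bit slot holding Dn. */
static unsigned toy_dreg_index(unsigned n, bool a64)
{
    assert(n < 32);
    return a64 ? 2 * n : n;
}

/* Index of the low 64 bit half of Qn; the high half is the next slot.
 * The mapping is the same in both execution states.
 */
static unsigned toy_qreg_low_index(unsigned n)
{
    assert(n < 32);
    return 2 * n;
}

/* Read Sn from the register array. */
static uint32_t toy_read_sreg(const uint64_t regs[64], unsigned n, bool a64)
{
    assert(n < 32);
    if (a64) {
        return (uint32_t)regs[2 * n];           /* bits 31..0 of regs[2n] */
    }
    /* AArch32: even n is the low word of regs[n/2], odd n the high word */
    return (uint32_t)(regs[n / 2] >> ((n & 1) ? 32 : 0));
}
```

Because Qn lands on the same pair of slots in both states, no copying is
needed when the CPU changes execution state, as the comment in the patch
points out.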

Signed-off-by: Alexander Graf 
Signed-off-by: John Rigby 
Message-id: 1368505980-17151-4-git-send-email-john.ri...@linaro.org
[PMM:
 * rearranged tbflags so AArch64? is bit 31 and if it is set then
  30..0 are freely available for whatever makes most sense for that mode
 * added version bump since we change VFP migration state
 * added a comment about how VFP/Neon register state works
 * physical address space is 48 bits, not 64
 * added ARM_FEATURE_AARCH64 flag to identify 64-bit capable CPUs
]
Signed-off-by: Peter Maydell 
---
 target-arm/cpu.c       |    8 +++
 target-arm/cpu.h       |  134 +++-
 target-arm/machine.c   |    8 +--
 target-arm/translate.c |   38 ++
 target-arm/translate.h |    1 +
 5 files changed, 151 insertions(+), 38 deletions(-)

diff --git a/target-arm/cpu.c b/target-arm/cpu.c
index 67d285d..ea142e5 100644
--- a/target-arm/cpu.c
+++ b/target-arm/cpu.c
@@ -84,6 +84,11 @@ static void arm_cpu_reset(CPUState *s)
 env->iwmmxt.cregs[ARM_IWMMXT_wCID] = 0x69051000 | 'Q';
 }
 
+if (arm_feature(env, ARM_FEATURE_AARCH64)) {
+/* 64 bit CPUs always start in 64 bit mode */
+env->aarch64 = 1;
+}
+
 #if defined(CONFIG_USER_ONLY)
 env->uncached_cpsr = ARM_CPU_MODE_USR;
 /* For user mode we must enable access to coprocessors */
@@ -834,6 +839,9 @@ static void arm_any_initfn(Object *obj)
 set_feature(&cpu->env, ARM_FEATURE_THUMB2EE);
 set_feature(&cpu->env, ARM_FEATURE_ARM_DIV);
 set_feature(&cpu->env, ARM_FEATURE_V7MP);
+#ifdef TARGET_AARCH64
+set_feature(&cpu->env, ARM_FEATURE_AARCH64);
+#endif
 cpu->midr = 0xffffffff;
 }
 #endif
diff --git a/target-arm/cpu.h b/target-arm/cpu.h
index 8d1cc47..ce835ef 100644
--- a/target-arm/cpu.h
+++ b/target-arm/cpu.h
@@ -19,13 +19,19 @@
 #ifndef CPU_ARM_H
 #define CPU_ARM_H
 
-#define TARGET_LONG_BITS 32
+#include "config.h"
 
-#define ELF_MACHINE EM_ARM
+#if defined(TARGET_AARCH64)
+  /* AArch64 definitions */
+#  define TARGET_LONG_BITS 64
+#  define ELF_MACHINE EM_AARCH64
+#else
+#  define TARGET_LONG_BITS 32
+#  define ELF_MACHINE EM_ARM
+#endif
 
 #define CPUArchState struct CPUARMState
 
-#include "config.h"
 #include "qemu-common.h"
 #include "exec/cpu-defs.h"
 
@@ -97,6 +103,20 @@ typedef struct ARMGenericTimer {
 typedef struct CPUARMState {
 /* Regs for current mode.  */
 uint32_t regs[16];
+
+/* 32/64 switch only happens when taking and returning from
+ * exceptions so the overlap semantics are taken care of then
+ * instead of having a complicated union.
+ */
+/* Regs for A64 mode.  */
+uint64_t xregs[32];
+uint64_t pc;
+/* TODO: pstate doesn't correspond to an architectural register;
+ * it would be better modelled as the underlying fields.
+ */
+uint32_t pstate;
+uint32_t aarch64; /* 1 if CPU is in aarch64 state; inverse of PSTATE.nRW */
+
 /* Frequently accessed CPSR bits are stored separately for efficiency.
This contains all the other bits.  Use cpsr_{read,write} to access
the whole CPSR.  */
@@ -175,6 +195,11 @@ typedef struct CPUARMState {
 uint32_t c15_power_control; /* power control */
 } cp15;
 
+/* System registers (AArch64) */
+struct {
+uint64_t tpidr_el0;
+} sr;
+
 struct {
 uint32_t other_sp;
 uint32_t vecbase;
@@ -191,7 +216,22 @@ typedef struct CPUARMState {
 
 /* VFP coprocessor state.  */
 struct {
-float64 regs[32];
+/* VFP/Neon register state. Note that the mapping between S, D and Q
+ * views of the register bank differs between AArch64 and AArch32:
+ * In AArch32:
+ *  Qn = regs[2n+1]:regs[2n]
+ *  Dn = regs[n]
+ *  Sn = regs[n/2] bits 31..0 for even n, and bits 63..32 for odd n
+ * (and regs[32] to regs[63] are inaccessible)
+ * In AArch64:
+ *  Qn = regs[2n+1]:regs[2n]
+ *  Dn = regs[2n]
+ *  Sn = regs[2n] bits 31..0
+ * This corresponds to the architecturally defined mapping between
+ * the two execution states, and means we do not need to explicitly
+ * map these registers when changing states.
+ */
+float64 regs[64];
 
 uint32_t xregs[16];
 /* We store these fpcsr fields separately for convenience.  */
@@ -261,6 +301,20 @@ int bank_number(int mode);
 void switch_mode(CPUARMState *, int);
 uint32_t do_arm_semihosting(CPUARMState *env);
 
+static inline bool is_a64(CPUARMState *env)
+{
+return env->aarch64;
+}
+
+#define PSTATE_N_SHIFT 3
+#define PSTATE_N  (1 << PSTATE_N_SHIFT)
+#define PSTATE_Z

[Qemu-devel] [PATCH v6 10/24] target-arm: Add AArch64 translation stub

2013-09-03 Thread Peter Maydell

From: Alexander Graf 

We should translate AArch64 mode separately from AArch32 mode. In AArch64 mode
the registers look vastly different and the instruction encoding is completely
different; basically the system turns into a different machine.

So let's do a simple if() in translate.c to decide whether we can handle the
current code in the legacy AArch32 code or in the new AArch64 code.

So far, the translation always complains about unallocated instructions. There
is no emulator functionality in this patch!

Signed-off-by: Alexander Graf 
Signed-off-by: John Rigby 
Message-id: 1368505980-17151-5-git-send-email-john.ri...@linaro.org
[PMM:
 * provide no-op versions of a64 functions ifndef TARGET_AARCH64;
   this lets us avoid #ifdefs in translate.c
 * insert the missing call to disas_a64_insn()
 * stash the insn in the DisasContext rather than reloading it in
   real_unallocated_encoding()
]
Signed-off-by: Peter Maydell 
---
 target-arm/Makefile.objs   |    2 +-
 target-arm/cpu-qom.h       |    5 ++
 target-arm/cpu64.c         |    3 +
 target-arm/translate-a64.c |  139 
 target-arm/translate.c     |   14 -
 target-arm/translate.h     |   19 ++
 6 files changed, 178 insertions(+), 4 deletions(-)
 create mode 100644 target-arm/translate-a64.c

diff --git a/target-arm/Makefile.objs b/target-arm/Makefile.objs
index baebc50..a11d76e 100644
--- a/target-arm/Makefile.objs
+++ b/target-arm/Makefile.objs
@@ -5,4 +5,4 @@ obj-$(CONFIG_NO_KVM) += kvm-stub.o
 obj-y += translate.o op_helper.o helper.o cpu.o
 obj-y += neon_helper.o iwmmxt_helper.o
 obj-y += gdbstub.o
-obj-$(TARGET_AARCH64) += cpu64.o
+obj-$(TARGET_AARCH64) += cpu64.o translate-a64.o
diff --git a/target-arm/cpu-qom.h b/target-arm/cpu-qom.h
index fbe846e..6502a7b 100644
--- a/target-arm/cpu-qom.h
+++ b/target-arm/cpu-qom.h
@@ -173,4 +173,9 @@ int arm_cpu_gdb_write_register(CPUState *cpu, uint8_t *buf, int reg);
 void arm_gt_ptimer_cb(void *opaque);
 void arm_gt_vtimer_cb(void *opaque);
 
+#ifdef TARGET_AARCH64
+void aarch64_cpu_dump_state(CPUState *cs, FILE *f,
+fprintf_function cpu_fprintf, int flags);
+#endif
+
 #endif
diff --git a/target-arm/cpu64.c b/target-arm/cpu64.c
index faee0f0..4428f6c 100644
--- a/target-arm/cpu64.c
+++ b/target-arm/cpu64.c
@@ -70,6 +70,9 @@ static void aarch64_cpu_finalizefn(Object *obj)
 
 static void aarch64_cpu_class_init(ObjectClass *oc, void *data)
 {
+CPUClass *cc = CPU_CLASS(oc);
+
+cc->dump_state = aarch64_cpu_dump_state;
 }
 
 static void aarch64_cpu_register(const ARMCPUInfo *info)
diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
new file mode 100644
index 0000000..f120088
--- /dev/null
+++ b/target-arm/translate-a64.c
@@ -0,0 +1,139 @@
+/*
+ *  AArch64 translation
+ *
+ *  Copyright (c) 2013 Alexander Graf 
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+#include <stdarg.h>
+#include <stdlib.h>
+#include <stdio.h>
+#include <string.h>
+#include <inttypes.h>
+
+#include "cpu.h"
+#include "tcg-op.h"
+#include "qemu/log.h"
+#include "translate.h"
+#include "qemu/host-utils.h"
+
+#include "helper.h"
+#define GEN_HELPER 1
+#include "helper.h"
+
+static TCGv_i64 cpu_X[32];
+static TCGv_i64 cpu_pc;
+static TCGv_i32 pstate;
+
+static const char *regnames[] = {
+"x0", "x1", "x2", "x3", "x4", "x5", "x6", "x7",
+"x8", "x9", "x10", "x11", "x12", "x13", "x14", "x15",
+"x16", "x17", "x18", "x19", "x20", "x21", "x22", "x23",
+"x24", "x25", "x26", "x27", "x28", "x29", "lr", "sp"
+};
+
+/* initialize TCG globals.  */
+void a64_translate_init(void)
+{
+int i;
+
+cpu_pc = tcg_global_mem_new_i64(TCG_AREG0,
+offsetof(CPUARMState, pc),
+"pc");
+for (i = 0; i < 32; i++) {
+cpu_X[i] = tcg_global_mem_new_i64(TCG_AREG0,
+  offsetof(CPUARMState, xregs[i]),
+  regnames[i]);
+}
+
+pstate = tcg_global_mem_new_i32(TCG_AREG0,
+offsetof(CPUARMState, pstate),
+"pstate");
+}
+
+void aarch64_cpu_dump_state(CPUState *cs, FILE *f,
+fprintf_function cpu_fprintf, int flags)
+{
+ARMCPU *cpu = ARM_CPU(cs);
+CPUARMState *env = &cpu->env;
+int i;
+
+cpu_fprintf(f, "PC=%016"PRIx6

[Qemu-devel] [PATCH v6 18/24] linux-user: Implement cpu_set_tls() and cpu_clone_regs() for AArch64

2013-09-03 Thread Peter Maydell

From: Alexander Graf 

Signed-off-by: Alexander Graf 
Signed-off-by: John Rigby 
[PMM: pulled out from another patch; don't use is_a64() here;
 moved to linux-user from target-arm]
Signed-off-by: Peter Maydell 
---
 linux-user/aarch64/target_cpu.h |   35 +++
 1 file changed, 35 insertions(+)
 create mode 100644 linux-user/aarch64/target_cpu.h

diff --git a/linux-user/aarch64/target_cpu.h b/linux-user/aarch64/target_cpu.h
new file mode 100644
index 0000000..6f5539b
--- /dev/null
+++ b/linux-user/aarch64/target_cpu.h
@@ -0,0 +1,35 @@
+/*
+ * ARM AArch64 specific CPU ABI and functions for linux-user
+ *
+ * Copyright (c) 2013 Alexander Graf 
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef TARGET_CPU_H
+#define TARGET_CPU_H
+
+static inline void cpu_clone_regs(CPUARMState *env, target_ulong newsp)
+{
+if (newsp) {
+env->xregs[31] = newsp;
+}
+env->xregs[0] = 0;
+}
+
+static inline void cpu_set_tls(CPUARMState *env, target_ulong newtls)
+{
+env->sr.tpidr_el0 = newtls;
+}
+
+#endif
-- 
1.7.9.5

[Qemu-devel] [PATCH v6 13/24] linux-user: Add cpu loop for AArch64

2013-09-03 Thread Peter Maydell

Add the main linux-user cpu loop for AArch64. Since AArch64
has a different system call interface, doesn't need to worry
about FPA emulation and may in the future keep the prefetch/data
abort information in different system registers, it's simplest
just to use a completely separate loop from the 32 bit ARM
target, rather than peppering it with ifdefs.

Signed-off-by: Peter Maydell 
---
 linux-user/main.c |   82 +
 1 file changed, 82 insertions(+)

diff --git a/linux-user/main.c b/linux-user/main.c
index 03859bc..28cc58a 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -445,6 +445,9 @@ void cpu_loop(CPUX86State *env)
 __r;\
 })
 
+#ifdef TARGET_ABI32
+/* Commpage handling -- there is no commpage for AArch64 */
+
 /*
  * See the Linux kernel's Documentation/arm/kernel_user_helpers.txt
  * Input:
@@ -578,6 +581,7 @@ do_kernel_trap(CPUARMState *env)
 
 return 0;
 }
+#endif
 
 static int do_strex(CPUARMState *env)
 {
@@ -657,6 +661,7 @@ done:
 return segv;
 }
 
+#ifdef TARGET_ABI32
 void cpu_loop(CPUARMState *env)
 {
 CPUState *cs = CPU(arm_env_get_cpu(env));
@@ -869,6 +874,83 @@ void cpu_loop(CPUARMState *env)
 }
 }
 
+#else
+
+/* AArch64 main loop */
+void cpu_loop(CPUARMState *env)
+{
+CPUState *cs = CPU(arm_env_get_cpu(env));
+int trapnr, sig;
+target_siginfo_t info;
+uint32_t addr;
+
+for (;;) {
+cpu_exec_start(cs);
+trapnr = cpu_arm_exec(env);
+cpu_exec_end(cs);
+
+switch (trapnr) {
+case EXCP_SWI:
+env->xregs[0] = do_syscall(env,
+   env->xregs[8],
+   env->xregs[0],
+   env->xregs[1],
+   env->xregs[2],
+   env->xregs[3],
+   env->xregs[4],
+   env->xregs[5],
+   0, 0);
+break;
+case EXCP_INTERRUPT:
+/* just indicate that signals should be handled asap */
+break;
+case EXCP_UDEF:
+info.si_signo = SIGILL;
+info.si_errno = 0;
+info.si_code = TARGET_ILL_ILLOPN;
+info._sifields._sigfault._addr = env->pc;
+queue_signal(env, info.si_signo, &info);
+break;
+case EXCP_PREFETCH_ABORT:
+addr = env->cp15.c6_insn;
+goto do_segv;
+case EXCP_DATA_ABORT:
+addr = env->cp15.c6_data;
+do_segv:
+info.si_signo = SIGSEGV;
+info.si_errno = 0;
+/* XXX: check env->error_code */
+info.si_code = TARGET_SEGV_MAPERR;
+info._sifields._sigfault._addr = addr;
+queue_signal(env, info.si_signo, &info);
+break;
+case EXCP_DEBUG:
+case EXCP_BKPT:
+sig = gdb_handlesig(cs, TARGET_SIGTRAP);
+if (sig) {
+info.si_signo = sig;
+info.si_errno = 0;
+info.si_code = TARGET_TRAP_BRKPT;
+queue_signal(env, info.si_signo, &info);
+}
+break;
+case EXCP_STREX:
+if (do_strex(env)) {
+addr = env->cp15.c6_data;
+goto do_segv;
+}
+break;
+default:
+fprintf(stderr, "qemu: unhandled CPU exception 0x%x - aborting\n",
+trapnr);
+cpu_dump_state(cs, stderr, fprintf, 0);
+abort();
+}
+process_pending_signals(env);
+}
+}
+#endif /* ndef TARGET_ABI32 */
+
 #endif
 
 #ifdef TARGET_UNICORE32
-- 
1.7.9.5
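
The EXCP_SWI case in the loop above encodes the AArch64 syscall register convention: the number is taken from x8, up to six arguments from x0..x5, and the result is written back to x0. The following is a small sketch of just that step. Everything named toy_* or TOY_* is made up for illustration (including the syscall number), and toy_do_syscall() is not linux-user's do_syscall().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NR_ADD  1    /* hypothetical syscall number, not a real Linux one */
#define TOY_ENOSYS  38   /* errno value used here for "no such syscall" */

typedef struct ToyA64State {
    uint64_t xregs[32];
    uint64_t pc;
} ToyA64State;

/* Hypothetical syscall implementation: one call that adds two arguments. */
static int64_t toy_do_syscall(uint64_t num, const uint64_t args[6])
{
    switch (num) {
    case TOY_NR_ADD:
        return (int64_t)(args[0] + args[1]);
    default:
        return -TOY_ENOSYS;
    }
}

/* Syscall trap: number in x8, arguments in x0..x5, result back in x0. */
static void toy_handle_swi(ToyA64State *env)
{
    uint64_t args[6];
    size_t i;

    assert(env != NULL);
    for (i = 0; i < 6; i++) {
        args[i] = env->xregs[i];
    }
    env->xregs[0] = (uint64_t)toy_do_syscall(env->xregs[8], args);
}
```

Errors come back as small negative values in x0, which is why the sketch casts through int64_t rather than using a separate error register.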

[Qemu-devel] [PATCH v6 11/24] target-arm: Add AArch64 gdbstub support

2013-09-03 Thread Peter Maydell

From: Alexander Graf 

We want to be able to debug AArch64 guests. So let's add the respective gdb
stub functions and xml descriptions that allow us to do so.

Signed-off-by: Alexander Graf 
Signed-off-by: John Rigby 
Message-id: 1368505980-17151-6-git-send-email-john.ri...@linaro.org
[PMM: dropped unused fp regs XML for now; moved 64 bit only functions
 to new gdbstub64.c; these are hooked up in AArch64CPU, not via
 ifdefs in ARMCPU]
Signed-off-by: Peter Maydell 
---
 gdb-xml/aarch64-core.xml |   46 +
 target-arm/Makefile.objs |2 +-
 target-arm/cpu-qom.h |2 ++
 target-arm/cpu64.c   |4 +++
 target-arm/gdbstub64.c   |   73 ++
 5 files changed, 126 insertions(+), 1 deletion(-)
 create mode 100644 gdb-xml/aarch64-core.xml
 create mode 100644 target-arm/gdbstub64.c

diff --git a/gdb-xml/aarch64-core.xml b/gdb-xml/aarch64-core.xml
new file mode 100644
index 000..e1e9dc3
--- /dev/null
+++ b/gdb-xml/aarch64-core.xml
@@ -0,0 +1,46 @@
+
+
+
+
+
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+
+  
+  
+
diff --git a/target-arm/Makefile.objs b/target-arm/Makefile.objs
index a11d76e..6453f5c 100644
--- a/target-arm/Makefile.objs
+++ b/target-arm/Makefile.objs
@@ -5,4 +5,4 @@ obj-$(CONFIG_NO_KVM) += kvm-stub.o
 obj-y += translate.o op_helper.o helper.o cpu.o
 obj-y += neon_helper.o iwmmxt_helper.o
 obj-y += gdbstub.o
-obj-$(TARGET_AARCH64) += cpu64.o translate-a64.o
+obj-$(TARGET_AARCH64) += cpu64.o translate-a64.o gdbstub64.o
diff --git a/target-arm/cpu-qom.h b/target-arm/cpu-qom.h
index 6502a7b..b55306a 100644
--- a/target-arm/cpu-qom.h
+++ b/target-arm/cpu-qom.h
@@ -176,6 +176,8 @@ void arm_gt_vtimer_cb(void *opaque);
 #ifdef TARGET_AARCH64
 void aarch64_cpu_dump_state(CPUState *cs, FILE *f,
 fprintf_function cpu_fprintf, int flags);
+int aarch64_cpu_gdb_read_register(CPUState *cpu, uint8_t *buf, int reg);
+int aarch64_cpu_gdb_write_register(CPUState *cpu, uint8_t *buf, int reg);
 #endif
 
 #endif
diff --git a/target-arm/cpu64.c b/target-arm/cpu64.c
index 4428f6c..3e99c21 100644
--- a/target-arm/cpu64.c
+++ b/target-arm/cpu64.c
@@ -73,6 +73,10 @@ static void aarch64_cpu_class_init(ObjectClass *oc, void *data)
 CPUClass *cc = CPU_CLASS(oc);
 
 cc->dump_state = aarch64_cpu_dump_state;
+cc->gdb_read_register = aarch64_cpu_gdb_read_register;
+cc->gdb_write_register = aarch64_cpu_gdb_write_register;
+cc->gdb_num_core_regs = 34;
+cc->gdb_core_xml_file = "aarch64-core.xml";
 }
 
 static void aarch64_cpu_register(const ARMCPUInfo *info)
diff --git a/target-arm/gdbstub64.c b/target-arm/gdbstub64.c
new file mode 100644
index 000..7cb6a7c
--- /dev/null
+++ b/target-arm/gdbstub64.c
@@ -0,0 +1,73 @@
+/*
+ * ARM gdb server stub: AArch64 specific functions.
+ *
+ * Copyright (c) 2013 SUSE LINUX Products GmbH
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+#include "config.h"
+#include "qemu-common.h"
+#include "exec/gdbstub.h"
+
+int aarch64_cpu_gdb_read_register(CPUState *cs, uint8_t *mem_buf, int n)
+{
+ARMCPU *cpu = ARM_CPU(cs);
+CPUARMState *env = &cpu->env;
+
+if (n < 31) {
+/* Core integer register.  */
+return gdb_get_reg64(mem_buf, env->xregs[n]);
+}
+switch (n) {
+case 31:
+return gdb_get_reg64(mem_buf, env->xregs[31]);
+break;
+case 32:
+return gdb_get_reg64(mem_buf, env->pc);
+break;
+case 33:
+return gdb_get_reg32(mem_buf, env->pstate);
+}
+/* Unknown register.  */
+return 0;
+}
+
+int aarch64_cpu_gdb_write_register(CPUState *cs, uint8_t *mem_buf, int n)
+{
+ARMCPU *cpu = ARM_CPU(cs);
+CPUARMState *env = &cpu->env;
+uint64_t tmp;
+
+tmp = ldq_p(mem_buf);
+
+if (n < 31) {
+/* Core integer register.  */
+env->xregs[n] = tmp;
+return 8;
+}
+switch (n) {
+case 31:
+env->xregs[31] = tmp;
+return 8;
+case 32:
+env->pc = tmp;
+return 8;
+case 33:
+/* CPSR */
+env->pstate = tmp;
+return 4;
+}
+/* Unknown register.  */
+return 0;
+}
-- 
1.7.9.5
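
The register numbering used by the stub above (0..30 for x0..x30, 31 for SP, 32 for PC, 33 for the 32-bit pstate, 34 registers in total) can be sketched in isolation as follows. This is only an illustration: ToyA64State, toy_put_le() and toy_gdb_read_register() are hypothetical names, and the sketch assumes a little-endian target instead of using QEMU's gdb_get_reg64()/gdb_get_reg32() helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct ToyA64State {
    uint64_t xregs[32];   /* x0..x30, index 31 is SP */
    uint64_t pc;
    uint32_t pstate;
} ToyA64State;

/* Store the low `len` bytes of `val` in little-endian order. */
static int toy_put_le(uint8_t *buf, uint64_t val, int len)
{
    int i;

    for (i = 0; i < len; i++) {
        buf[i] = (uint8_t)(val >> (8 * i));
    }
    return len;
}

/* Returns the number of bytes written, or 0 for an unknown register. */
static int toy_gdb_read_register(const ToyA64State *env, uint8_t *buf, int n)
{
    assert(env != NULL && buf != NULL);
    if (n >= 0 && n <= 31) {
        return toy_put_le(buf, env->xregs[n], 8);   /* x0..x30 and SP */
    }
    switch (n) {
    case 32:
        return toy_put_le(buf, env->pc, 8);
    case 33:
        return toy_put_le(buf, env->pstate, 4);     /* 32 bits wide */
    default:
        return 0;
    }
}
```

The return value matters to the caller because it is the only way the width of each register (8 bytes, or 4 for pstate) is communicated.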

[Qemu-devel] [PATCH v6 04/24] target-arm: Export cpu_env

2013-09-03 Thread Peter Maydell

From: Alexander Graf 

The cpu_env tcg variable will be used by both the AArch32 and AArch64
handling code. Unstaticify it, so that both sides can make use of it.

Signed-off-by: Alexander Graf 
Signed-off-by: John Rigby 
Message-id: 1368505980-17151-3-git-send-email-john.ri...@linaro.org
Signed-off-by: Peter Maydell 
---
 target-arm/translate.c |2 +-
 target-arm/translate.h |2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/target-arm/translate.c b/target-arm/translate.c
index 1bb6f46..a6adcc8 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -60,7 +60,7 @@ static uint32_t gen_opc_condexec_bits[OPC_BUF_SIZE];
 #define DISAS_WFI 4
 #define DISAS_SWI 5
 
-static TCGv_ptr cpu_env;
+TCGv_ptr cpu_env;
 /* We reuse the same 64-bit temporaries for efficiency.  */
 static TCGv_i64 cpu_V0, cpu_V1, cpu_M0;
 static TCGv_i32 cpu_R[16];
diff --git a/target-arm/translate.h b/target-arm/translate.h
index e727bc6..8ba1433 100644
--- a/target-arm/translate.h
+++ b/target-arm/translate.h
@@ -24,4 +24,6 @@ typedef struct DisasContext {
 int vec_stride;
 } DisasContext;
 
+extern TCGv_ptr cpu_env;
+
 #endif /* TARGET_ARM_TRANSLATE_H */
-- 
1.7.9.5

[Qemu-devel] [PATCH v6 17/24] linux-user: Make sure NWFPE code is 32 bit ARM only

2013-09-03 Thread Peter Maydell

On ARM, linux-user emulation includes NWFPE support for emulating the
ancient FPA floating point coprocessor. This has long since been
superseded by VFP and is only required for legacy binaries. The
AArch64 linux-user target doesn't compile in NWFPE support, so make
sure the relevant code is protected by suitable ifdefs.

Signed-off-by: Peter Maydell 
---
 linux-user/qemu.h |4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/linux-user/qemu.h b/linux-user/qemu.h
index 4a16e8f..4df4fcb 100644
--- a/linux-user/qemu.h
+++ b/linux-user/qemu.h
@@ -74,7 +74,7 @@ struct vm86_saved_state {
 };
 #endif
 
-#ifdef TARGET_ARM
+#if defined(TARGET_ARM) && defined(TARGET_ABI32)
 /* FPU emulator */
 #include "nwfpe/fpa11.h"
 #endif
@@ -98,8 +98,10 @@ struct emulated_sigtable {
 typedef struct TaskState {
 pid_t ts_tid; /* tid (or pid) of this task */
 #ifdef TARGET_ARM
+# ifdef TARGET_ABI32
 /* FPA state */
 FPA11 fpa;
+# endif
 int swi_errno;
 #endif
 #ifdef TARGET_UNICORE32
-- 
1.7.9.5
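
As a rough illustration of the guard pattern in the patch above, the sketch below compiles an FPA-style member into a task structure only when both an "ARM" and an "ABI32" macro are defined. The TOY_* macros and Toy* types are hypothetical and only mirror the shape of TARGET_ARM / TARGET_ABI32; TOY_TARGET_ABI32 is deliberately left undefined to model the AArch64 build.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_TARGET_ARM 1
/* TOY_TARGET_ABI32 is intentionally not defined: this models AArch64. */

#if defined(TOY_TARGET_ARM) && defined(TOY_TARGET_ABI32)
/* Stand-in for the FPA emulator state pulled in by nwfpe/fpa11.h. */
typedef struct ToyFPA11 {
    unsigned int fpsr;
} ToyFPA11;
#define TOY_HAVE_FPA 1
#else
#define TOY_HAVE_FPA 0
#endif

typedef struct ToyTaskState {
    int ts_tid;
#ifdef TOY_TARGET_ARM
# ifdef TOY_TARGET_ABI32
    ToyFPA11 fpa;        /* only present for 32 bit ARM */
# endif
    int swi_errno;       /* shared by both ARM variants */
#endif
} ToyTaskState;

/* Lets callers check at run time which variant was compiled. */
static int toy_task_has_fpa(void)
{
    return TOY_HAVE_FPA;
}
```

Defining TOY_TARGET_ABI32 before the first #if brings the member back without touching any other line, which is the property the nested ifdef is after.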

[Qemu-devel] [PATCH v6 06/24] target-arm: Pass DisasContext* to gen_set_pc_im()

2013-09-03 Thread Peter Maydell

We want gen_set_pc_im() to work for both AArch64 and AArch32, but
to do this we'll need the DisasContext* so we can tell which mode
we're in, so pass it in as a parameter.

Signed-off-by: Peter Maydell 
---
 target-arm/translate.c |   26 +-
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/target-arm/translate.c b/target-arm/translate.c
index 5a465fc..4477402 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -904,7 +904,7 @@ DO_GEN_ST(st8)
 DO_GEN_ST(st16)
 DO_GEN_ST(st32)
 
-static inline void gen_set_pc_im(target_ulong val)
+static inline void gen_set_pc_im(DisasContext *s, target_ulong val)
 {
 tcg_gen_movi_i32(cpu_R[15], val);
 }
@@ -3419,10 +3419,10 @@ static inline void gen_goto_tb(DisasContext *s, int n, target_ulong dest)
 tb = s->tb;
 if ((tb->pc & TARGET_PAGE_MASK) == (dest & TARGET_PAGE_MASK)) {
 tcg_gen_goto_tb(n);
-gen_set_pc_im(dest);
+gen_set_pc_im(s, dest);
 tcg_gen_exit_tb((uintptr_t)tb + n);
 } else {
-gen_set_pc_im(dest);
+gen_set_pc_im(s, dest);
 tcg_gen_exit_tb(0);
 }
 }
@@ -3551,7 +3551,7 @@ gen_set_condexec (DisasContext *s)
 static void gen_exception_insn(DisasContext *s, int offset, int excp)
 {
 gen_set_condexec(s);
-gen_set_pc_im(s->pc - offset);
+gen_set_pc_im(s, s->pc - offset);
 gen_exception(excp);
 s->is_jmp = DISAS_JUMP;
 }
@@ -3560,7 +3560,7 @@ static void gen_nop_hint(DisasContext *s, int val)
 {
 switch (val) {
 case 3: /* wfi */
-gen_set_pc_im(s->pc);
+gen_set_pc_im(s, s->pc);
 s->is_jmp = DISAS_WFI;
 break;
 case 2: /* wfe */
@@ -6337,7 +6337,7 @@ static int disas_coproc_insn(CPUARMState * env, DisasContext *s, uint32_t insn)
 if (isread) {
 return 1;
 }
-gen_set_pc_im(s->pc);
+gen_set_pc_im(s, s->pc);
 s->is_jmp = DISAS_WFI;
 return 0;
 default:
@@ -6357,7 +6357,7 @@ static int disas_coproc_insn(CPUARMState * env, DisasContext *s, uint32_t insn)
 tmp64 = tcg_const_i64(ri->resetvalue);
 } else if (ri->readfn) {
 TCGv_ptr tmpptr;
-gen_set_pc_im(s->pc);
+gen_set_pc_im(s, s->pc);
 tmp64 = tcg_temp_new_i64();
 tmpptr = tcg_const_ptr(ri);
 gen_helper_get_cp_reg64(tmp64, cpu_env, tmpptr);
@@ -6380,7 +6380,7 @@ static int disas_coproc_insn(CPUARMState * env, DisasContext *s, uint32_t insn)
 tmp = tcg_const_i32(ri->resetvalue);
 } else if (ri->readfn) {
 TCGv_ptr tmpptr;
-gen_set_pc_im(s->pc);
+gen_set_pc_im(s, s->pc);
 tmp = tcg_temp_new_i32();
 tmpptr = tcg_const_ptr(ri);
 gen_helper_get_cp_reg(tmp, cpu_env, tmpptr);
@@ -6415,7 +6415,7 @@ static int disas_coproc_insn(CPUARMState * env, DisasContext *s, uint32_t insn)
 tcg_temp_free_i32(tmphi);
 if (ri->writefn) {
 TCGv_ptr tmpptr = tcg_const_ptr(ri);
-gen_set_pc_im(s->pc);
+gen_set_pc_im(s, s->pc);
 gen_helper_set_cp_reg64(cpu_env, tmpptr, tmp64);
 tcg_temp_free_ptr(tmpptr);
 } else {
@@ -6426,7 +6426,7 @@ static int disas_coproc_insn(CPUARMState * env, DisasContext *s, uint32_t insn)
 if (ri->writefn) {
 TCGv_i32 tmp;
 TCGv_ptr tmpptr;
-gen_set_pc_im(s->pc);
+gen_set_pc_im(s, s->pc);
 tmp = load_reg(s, rt);
 tmpptr = tcg_const_ptr(ri);
 gen_helper_set_cp_reg(cpu_env, tmpptr, tmp);
@@ -8034,7 +8034,7 @@ static void disas_arm_insn(CPUARMState * env, DisasContext *s)
 break;
 case 0xf:
 /* swi */
-gen_set_pc_im(s->pc);
+gen_set_pc_im(s, s->pc);
 s->is_jmp = DISAS_SWI;
 break;
 default:
@@ -9935,7 +9935,7 @@ static void disas_thumb_insn(CPUARMState *env, DisasContext *s)
 
 if (cond == 0xf) {
 /* swi */
-gen_set_pc_im(s->pc);
+gen_set_pc_im(s, s->pc);
 s->is_jmp = DISAS_SWI;
 break;
 }
@@ -10185,7 +10185,7 @@ static inline void gen_intermediate_code_internal(ARMCPU *cpu,
 gen_set_label(dc->condlabel);
 }
 if (dc->condjmp || !dc->is_jmp) {
-gen_set_pc_im(dc->pc);
+gen_set_pc_im(dc, dc->pc);
 dc->condjmp = 0;
 }
 gen_set_condexec(dc);
-- 
1.7.9.5
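
To illustrate why the extra parameter in the patch above is useful, here is a sketch in which a "set PC" helper consults a context flag to decide between a 32-bit and a 64-bit program counter. This is not the QEMU code: the patch itself only adds the parameter, and the real helper emits TCG ops rather than writing to a struct. ToyDisasContext, ToyCPU and toy_set_pc_im() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical translation context: just the mode flag we care about. */
typedef struct ToyDisasContext {
    bool aarch64;
} ToyDisasContext;

/* Hypothetical CPU state with separate 32-bit and 64-bit PCs. */
typedef struct ToyCPU {
    uint32_t r15;   /* AArch32 program counter */
    uint64_t pc;    /* AArch64 program counter */
} ToyCPU;

/* The context tells the helper which PC to write and how wide it is. */
static void toy_set_pc_im(const ToyDisasContext *s, ToyCPU *cpu, uint64_t val)
{
    assert(s != NULL && cpu != NULL);
    if (s->aarch64) {
        cpu->pc = val;
    } else {
        cpu->r15 = (uint32_t)val;
    }
}
```

Without the context argument the helper would need a global or a second entry point to make the same decision, which is what threading `s` through every call site avoids.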

[Qemu-devel] [PATCH v6 01/24] target-arm: Make '-cpu any' available in linux-user mode only

2013-09-03 Thread Peter Maydell

Make the 'any' CPU for target-arm available only in linux-user mode.
The ARM target provides a CPU named "any", which turns on support for
all user-level instruction set extensions we know about. This is
intended for linux-user emulation mode, where it is the default CPU type.
It makes no sense to try to use this for system emulation, since we don't
initialize it with any system-level information like feature register
values or implementation specific cp15 registers. (Unsurprisingly, some
boards won't boot at all, though you might get lucky in some cases where
the guest doesn't happen to prod things that aren't there.)

Prevent users from making this command line error by removing the
CPU definition from the softmmu build.

Signed-off-by: Peter Maydell 
---
 target-arm/cpu.c |4 
 1 file changed, 4 insertions(+)

diff --git a/target-arm/cpu.c b/target-arm/cpu.c
index b2556c6..827e28e 100644
--- a/target-arm/cpu.c
+++ b/target-arm/cpu.c
@@ -822,6 +822,7 @@ static void pxa270c5_initfn(Object *obj)
 cpu->reset_sctlr = 0x0078;
 }
 
+#ifdef CONFIG_USER_ONLY
 static void arm_any_initfn(Object *obj)
 {
 ARMCPU *cpu = ARM_CPU(obj);
@@ -834,6 +835,7 @@ static void arm_any_initfn(Object *obj)
 set_feature(&cpu->env, ARM_FEATURE_V7MP);
 cpu->midr = 0x;
 }
+#endif
 
 typedef struct ARMCPUInfo {
 const char *name;
@@ -874,7 +876,9 @@ static const ARMCPUInfo arm_cpus[] = {
 { .name = "pxa270-b1",   .initfn = pxa270b1_initfn },
 { .name = "pxa270-c0",   .initfn = pxa270c0_initfn },
 { .name = "pxa270-c5",   .initfn = pxa270c5_initfn },
+#ifdef CONFIG_USER_ONLY
 { .name = "any", .initfn = arm_any_initfn },
+#endif
 };
 
 static void arm_cpu_class_init(ObjectClass *oc, void *data)
-- 
1.7.9.5
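
The effect of the patch above on CPU lookup can be sketched with a small name table in which the "any" entry exists only when a user-only macro is defined. The table, the id field, and the names toy_cpus, toy_cpu_by_name and TOY_USER_ONLY are hypothetical; the real code registers QOM types from arm_cpus[] rather than searching an array. TOY_USER_ONLY is left undefined here to model the softmmu build.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* TOY_USER_ONLY is intentionally not defined: this models softmmu. */

typedef struct ToyCPUInfo {
    const char *name;
    int id;              /* placeholder for the real initfn pointer */
} ToyCPUInfo;

static const ToyCPUInfo toy_cpus[] = {
    { "arm926",    1 },
    { "pxa270-c5", 2 },
#ifdef TOY_USER_ONLY
    { "any",       3 },
#endif
};

/* Returns the matching entry, or NULL if the name is not compiled in. */
static const ToyCPUInfo *toy_cpu_by_name(const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < sizeof(toy_cpus) / sizeof(toy_cpus[0]); i++) {
        if (strcmp(toy_cpus[i].name, name) == 0) {
            return &toy_cpus[i];
        }
    }
    return NULL;
}
```

With the entry compiled out, asking for "any" fails the same way as asking for a misspelled CPU name, so the command line error is reported up front.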

[Qemu-devel] [PATCH v6 24/24] default-configs: Add config for aarch64-softmmu

2013-09-03 Thread Peter Maydell

Add a config for aarch64-softmmu; this enables building of this target.
The resulting executable doesn't know about any 64 bit CPUs, but all
the 32 bit CPUs and board models work.

Signed-off-by: Peter Maydell 
---
 default-configs/aarch64-softmmu.mak |   82 +++
 1 file changed, 82 insertions(+)
 create mode 100644 default-configs/aarch64-softmmu.mak

diff --git a/default-configs/aarch64-softmmu.mak b/default-configs/aarch64-softmmu.mak
new file mode 100644
index 000..175362f
--- /dev/null
+++ b/default-configs/aarch64-softmmu.mak
@@ -0,0 +1,82 @@
+# Default configuration for aarch64-softmmu
+
+include pci.mak
+include usb.mak
+CONFIG_GDBSTUB_XML=y
+CONFIG_VGA=y
+CONFIG_ISA_MMIO=y
+CONFIG_NAND=y
+CONFIG_ECC=y
+CONFIG_SERIAL=y
+CONFIG_PTIMER=y
+CONFIG_SD=y
+CONFIG_MAX7310=y
+CONFIG_WM8750=y
+CONFIG_TWL92230=y
+CONFIG_TSC2005=y
+CONFIG_LM832X=y
+CONFIG_TMP105=y
+CONFIG_STELLARIS=y
+CONFIG_STELLARIS_INPUT=y
+CONFIG_STELLARIS_ENET=y
+CONFIG_SSD0303=y
+CONFIG_SSD0323=y
+CONFIG_ADS7846=y
+CONFIG_MAX111X=y
+CONFIG_SSI=y
+CONFIG_SSI_SD=y
+CONFIG_SSI_M25P80=y
+CONFIG_LAN9118=y
+CONFIG_SMC91C111=y
+CONFIG_DS1338=y
+CONFIG_PFLASH_CFI01=y
+CONFIG_PFLASH_CFI02=y
+CONFIG_MICRODRIVE=y
+CONFIG_USB_MUSB=y
+
+CONFIG_ARM11MPCORE=y
+CONFIG_A9MPCORE=y
+CONFIG_A15MPCORE=y
+
+CONFIG_ARM_GIC=y
+CONFIG_ARM_GIC_KVM=$(CONFIG_KVM)
+CONFIG_ARM_TIMER=y
+CONFIG_ARM_MPTIMER=y
+CONFIG_PL011=y
+CONFIG_PL022=y
+CONFIG_PL031=y
+CONFIG_PL041=y
+CONFIG_PL050=y
+CONFIG_PL061=y
+CONFIG_PL080=y
+CONFIG_PL110=y
+CONFIG_PL181=y
+CONFIG_PL190=y
+CONFIG_PL310=y
+CONFIG_PL330=y
+CONFIG_CADENCE=y
+CONFIG_XGMAC=y
+CONFIG_EXYNOS4=y
+CONFIG_PXA2XX=y
+CONFIG_BITBANG_I2C=y
+CONFIG_FRAMEBUFFER=y
+CONFIG_XILINX_SPIPS=y
+
+CONFIG_A9SCU=y
+CONFIG_MARVELL_88W8618=y
+CONFIG_OMAP=y
+CONFIG_TSC210X=y
+CONFIG_BLIZZARD=y
+CONFIG_ONENAND=y
+CONFIG_TUSB6010=y
+CONFIG_IMX=y
+CONFIG_MAINSTONE=y
+CONFIG_NSERIES=y
+CONFIG_REALVIEW=y
+CONFIG_ZAURUS=y
+CONFIG_ZYNQ=y
+
+CONFIG_VERSATILE_PCI=y
+CONFIG_VERSATILE_I2C=y
+
+CONFIG_SDHCI=y
-- 
1.7.9.5

[Qemu-devel] [PATCH v6 21/24] linux-user: Add AArch64 support

2013-09-03 Thread Peter Maydell

From: Alexander Graf 

This patch adds support for AArch64 in all the small corners of
linux-user (primarily in image loading and startup code).

Signed-off-by: Alexander Graf 
Signed-off-by: John Rigby 
Message-id: 1368505980-17151-11-git-send-email-john.ri...@linaro.org
[PMM:
 * removed some unnecessary #defines from syscall.h
 * catch attempts to use a 32 bit only cpu with aarch64-linux-user
 * termios stuff moved into its own patch
 * we specify our minimum uname version here now
]
Signed-off-by: Peter Maydell 
---
 linux-user/aarch64/syscall.h |9 +
 linux-user/elfload.c |   15 +--
 linux-user/main.c|   16 
 3 files changed, 38 insertions(+), 2 deletions(-)
 create mode 100644 linux-user/aarch64/syscall.h

diff --git a/linux-user/aarch64/syscall.h b/linux-user/aarch64/syscall.h
new file mode 100644
index 000..aef419e
--- /dev/null
+++ b/linux-user/aarch64/syscall.h
@@ -0,0 +1,9 @@
+struct target_pt_regs {
+uint64_t regs[31];
+uint64_t sp;
+uint64_t pc;
+uint64_t pstate;
+};
+
+#define UNAME_MACHINE "aarch64"
+#define UNAME_MINIMUM_RELEASE "3.8.0"
diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index 7ce2eab..e2f40b9 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -269,16 +269,26 @@ static void elf_core_copy_regs(target_elf_gregset_t *regs, const CPUX86State *en
 
 #define ELF_START_MMAP 0x8000
 
-#define elf_check_arch(x) ( (x) == EM_ARM )
+#define elf_check_arch(x) ((x) == ELF_MACHINE)
 
+#define ELF_ARCH    ELF_MACHINE
+
+#ifdef TARGET_AARCH64
+#define ELF_CLASS   ELFCLASS64
+#else
 #define ELF_CLASS   ELFCLASS32
-#define ELF_ARCH    EM_ARM
+#endif
 
 static inline void init_thread(struct target_pt_regs *regs,
struct image_info *infop)
 {
 abi_long stack = infop->start_stack;
 memset(regs, 0, sizeof(*regs));
+
+#ifdef TARGET_AARCH64
+regs->pc = infop->entry & ~0x3ULL;
+regs->sp = stack;
+#else
 regs->ARM_cpsr = 0x10;
 if (infop->entry & 1)
 regs->ARM_cpsr |= CPSR_T;
@@ -292,6 +302,7 @@ static inline void init_thread(struct target_pt_regs *regs,
 /* For uClinux PIC binaries.  */
 /* XXX: Linux does this only on ARM with no MMU (do we care ?) */
 regs->ARM_r10 = infop->start_data;
+#endif
 }
 
 #define ELF_NREG    18
diff --git a/linux-user/main.c b/linux-user/main.c
index d530f01..e213085 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -3964,6 +3964,22 @@ int main(int argc, char **argv, char **envp)
 cpu_x86_load_seg(env, R_FS, 0);
 cpu_x86_load_seg(env, R_GS, 0);
 #endif
+#elif defined(TARGET_AARCH64)
+{
+int i;
+
+if (!(arm_feature(env, ARM_FEATURE_AARCH64))) {
+fprintf(stderr,
+"The selected ARM CPU does not support 64 bit mode\n");
+exit(1);
+}
+
+for (i = 0; i < 31; i++) {
+env->xregs[i] = regs->regs[i];
+}
+env->pc = regs->pc;
+env->xregs[31] = regs->sp;
+}
 #elif defined(TARGET_ARM)
 {
 int i;
-- 
1.7.9.5
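
The startup path added by the patch above has two steps: init_thread() fills a pt_regs-style structure from the loaded image (entry point with the low two bits cleared, initial stack pointer), and main() later copies those values into the CPU state. A minimal sketch of both steps follows; all Toy* types and toy_* functions are hypothetical stand-ins, not the linux-user structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct ToyPtRegs {
    uint64_t regs[31];
    uint64_t sp;
    uint64_t pc;
    uint64_t pstate;
} ToyPtRegs;

typedef struct ToyImageInfo {
    uint64_t entry;        /* ELF entry point */
    uint64_t start_stack;  /* initial stack pointer */
} ToyImageInfo;

typedef struct ToyA64State {
    uint64_t xregs[32];    /* index 31 is SP */
    uint64_t pc;
} ToyA64State;

/* Step 1: derive the initial register file from the loaded image. */
static void toy_init_thread(ToyPtRegs *regs, const ToyImageInfo *infop)
{
    assert(regs != NULL && infop != NULL);
    memset(regs, 0, sizeof(*regs));
    regs->pc = infop->entry & ~0x3ULL;   /* A64 instructions are 4-byte aligned */
    regs->sp = infop->start_stack;
}

/* Step 2: copy the register file into the CPU state before the cpu loop. */
static void toy_load_regs(ToyA64State *env, const ToyPtRegs *regs)
{
    size_t i;

    assert(env != NULL && regs != NULL);
    for (i = 0; i < 31; i++) {
        env->xregs[i] = regs->regs[i];
    }
    env->pc = regs->pc;
    env->xregs[31] = regs->sp;
}
```

Unlike the 32-bit path, there is no Thumb bit to extract from the entry address, so the mask simply discards the low bits.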

[Qemu-devel] [PATCH v6 12/24] linux-user: Don't treat AArch64 cpu names specially

2013-09-03 Thread Peter Maydell

From: Alexander Graf 

32-bit ARM has a lot of different names for different types of CPUs it supports.
On AArch64, we don't have this, so we really don't want to execute the 32-bit
logic. Stub it out for AArch64 linux-user guests.

Signed-off-by: Alexander Graf 
Signed-off-by: John Rigby 
Message-id: 1368505980-17151-7-git-send-email-john.ri...@linaro.org
Signed-off-by: Peter Maydell 
---
 linux-user/cpu-uname.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/linux-user/cpu-uname.c b/linux-user/cpu-uname.c
index cc713e6..5db6e89 100644
--- a/linux-user/cpu-uname.c
+++ b/linux-user/cpu-uname.c
@@ -30,7 +30,8 @@
  * return here */
 const char *cpu_to_uname_machine(void *cpu_env)
 {
-#ifdef TARGET_ARM
+#if defined(TARGET_ARM) && !defined(TARGET_AARCH64)
+
 /* utsname machine name on linux arm is CPU arch name + endianness, e.g.
  * armv7l; to get a list of CPU arch names from the linux source, use:
  * grep arch_name: -A1 linux/arch/arm/mm/proc-*.S
-- 
1.7.9.5

[Qemu-devel] [PATCH v6 08/24] target-arm: Disable 32 bit CPUs in 64 bit linux-user builds

2013-09-03 Thread Peter Maydell

If we're building aarch64-linux-user then the 32 bit CPUs are
all unwanted, because they can't possibly execute the 64 bit
binaries we will be running; disable them.

Signed-off-by: Peter Maydell 
---
 target-arm/cpu.c |9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/target-arm/cpu.c b/target-arm/cpu.c
index 827e28e..67d285d 100644
--- a/target-arm/cpu.c
+++ b/target-arm/cpu.c
@@ -288,8 +288,6 @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
 acc->parent_realize(dev, errp);
 }
 
-/* CPU models */
-
 static ObjectClass *arm_cpu_class_by_name(const char *cpu_model)
 {
 ObjectClass *oc;
@@ -309,6 +307,9 @@ static ObjectClass *arm_cpu_class_by_name(const char *cpu_model)
 return oc;
 }
 
+/* CPU models. These are not needed for the AArch64 linux-user build. */
+#if !defined(CONFIG_USER_ONLY) || !defined(TARGET_AARCH64)
+
 static void arm926_initfn(Object *obj)
 {
 ARMCPU *cpu = ARM_CPU(obj);
@@ -837,6 +838,8 @@ static void arm_any_initfn(Object *obj)
 }
 #endif
 
+#endif /* !defined(CONFIG_USER_ONLY) || !defined(TARGET_AARCH64) */
+
 typedef struct ARMCPUInfo {
 const char *name;
 void (*initfn)(Object *obj);
@@ -844,6 +847,7 @@ typedef struct ARMCPUInfo {
 } ARMCPUInfo;
 
 static const ARMCPUInfo arm_cpus[] = {
+#if !defined(CONFIG_USER_ONLY) || !defined(TARGET_AARCH64)
 { .name = "arm926",  .initfn = arm926_initfn },
 { .name = "arm946",  .initfn = arm946_initfn },
 { .name = "arm1026", .initfn = arm1026_initfn },
@@ -879,6 +883,7 @@ static const ARMCPUInfo arm_cpus[] = {
 #ifdef CONFIG_USER_ONLY
 { .name = "any", .initfn = arm_any_initfn },
 #endif
+#endif
 };
 
 static void arm_cpu_class_init(ObjectClass *oc, void *data)
-- 
1.7.9.5

[Qemu-devel] [PATCH v6 23/24] default-configs: Add config for aarch64-linux-user

2013-09-03 Thread Peter Maydell

Add a config for aarch64-linux-user, thereby enabling it as
a valid target.

Signed-off-by: Peter Maydell 
---
 default-configs/aarch64-linux-user.mak |3 +++
 1 file changed, 3 insertions(+)
 create mode 100644 default-configs/aarch64-linux-user.mak

diff --git a/default-configs/aarch64-linux-user.mak b/default-configs/aarch64-linux-user.mak
new file mode 100644
index 000..3df7de5
--- /dev/null
+++ b/default-configs/aarch64-linux-user.mak
@@ -0,0 +1,3 @@
+# Default configuration for aarch64-linux-user
+
+CONFIG_GDBSTUB_XML=y
-- 
1.7.9.5

[Qemu-devel] [PATCH v6 15/24] linux-user: Fix up AArch64 syscall handlers

2013-09-03 Thread Peter Maydell

From: Alexander Graf 

Some syscall handlers have special code for ARM enabled that we don't
need on AArch64. Exclude AArch64 in those cases. In other places we
can share struct definitions with other targets or have to provide our
own.

With this patch applied, most syscall definitions in linux-user should
be sound for AArch64.

Signed-off-by: Alexander Graf 
Signed-off-by: John Rigby 
Message-id: 1368505980-17151-9-git-send-email-john.ri...@linaro.org
Signed-off-by: Peter Maydell 
---
 linux-user/syscall.c  |5 +++--
 linux-user/syscall_defs.h |   28 ++--
 2 files changed, 29 insertions(+), 4 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index f986548..73046b0 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -4737,7 +4737,7 @@ static inline abi_long host_to_target_stat64(void *cpu_env,
  abi_ulong target_addr,
  struct stat *host_st)
 {
-#ifdef TARGET_ARM
+#if defined(TARGET_ARM) && defined(TARGET_ABI32)
 if (((CPUARMState *)cpu_env)->eabi) {
 struct target_eabi_stat64 *target_st;
 
@@ -6394,7 +6394,8 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
 #endif
 #ifdef TARGET_NR_mmap
 case TARGET_NR_mmap:
-#if (defined(TARGET_I386) && defined(TARGET_ABI32)) || defined(TARGET_ARM) || \
+#if (defined(TARGET_I386) && defined(TARGET_ABI32)) || \
+(defined(TARGET_ARM) && defined(TARGET_ABI32)) || \
 defined(TARGET_M68K) || defined(TARGET_CRIS) || defined(TARGET_MICROBLAZE) \
 || defined(TARGET_S390X)
 {
diff --git a/linux-user/syscall_defs.h b/linux-user/syscall_defs.h
index 086fbff..2ebe356 100644
--- a/linux-user/syscall_defs.h
+++ b/linux-user/syscall_defs.h
@@ -1137,7 +1137,8 @@ struct target_winsize {
 #define TARGET_MAP_UNINITIALIZED 0x400 /* for anonymous mmap, memory could be uninitialized */
 #endif
 
-#if (defined(TARGET_I386) && defined(TARGET_ABI32)) || defined(TARGET_ARM) \
+#if (defined(TARGET_I386) && defined(TARGET_ABI32)) \
+|| (defined(TARGET_ARM) && defined(TARGET_ABI32)) \
 || defined(TARGET_CRIS) || defined(TARGET_UNICORE32)
 struct target_stat {
unsigned short st_dev;
@@ -1835,6 +1836,28 @@ struct target_stat {
 abi_long   st_blocks;
 abi_ulong  __unused[3];
 };
+#elif defined(TARGET_AARCH64)
+struct target_stat {
+abi_ulong  st_dev;
+abi_ulong  st_ino;
+unsigned int st_mode;
+unsigned int st_nlink;
+unsigned int   st_uid;
+unsigned int   st_gid;
+abi_ulong  st_rdev;
+abi_ulong  _pad1;
+abi_long  st_size;
+int st_blksize;
+int __pad2;
+abi_long   st_blocks;
+abi_long  target_st_atime;
+abi_ulong  target_st_atime_nsec;
+abi_long  target_st_mtime;
+abi_ulong  target_st_mtime_nsec;
+abi_long  target_st_ctime;
+abi_ulong  target_st_ctime_nsec;
+unsigned int __unused[2];
+};
 #elif defined(TARGET_OPENRISC)
 
 /* These are the asm-generic versions of the stat and stat64 structures */
@@ -1943,7 +1966,8 @@ struct target_statfs64 {
   uint32_t f_spare[6];
 };
 #elif (defined(TARGET_PPC64) || defined(TARGET_X86_64) || \
-   defined(TARGET_SPARC64)) && !defined(TARGET_ABI32)
+   defined(TARGET_SPARC64) || defined(TARGET_AARCH64)) && \
+   !defined(TARGET_ABI32)
 struct target_statfs {
abi_long f_type;
abi_long f_bsize;
-- 
1.7.9.5
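
One way to sanity-check a guest structure such as the AArch64 target_stat added above is to restate it with fixed-width types and assert its size and a few offsets. The sketch below does that under the assumption that abi_long/abi_ulong are 64 bits and int/unsigned int are 32 bits for this target. struct toy_a64_stat and toy_stat_size() are hypothetical names, and the padding fields are renamed to avoid reserved identifiers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fixed-width restatement of the AArch64 stat layout from the patch. */
struct toy_a64_stat {
    uint64_t st_dev;
    uint64_t st_ino;
    uint32_t st_mode;
    uint32_t st_nlink;
    uint32_t st_uid;
    uint32_t st_gid;
    uint64_t st_rdev;
    uint64_t pad1;
    int64_t  st_size;
    int32_t  st_blksize;
    int32_t  pad2;
    int64_t  st_blocks;
    int64_t  target_st_atime;
    uint64_t target_st_atime_nsec;
    int64_t  target_st_mtime;
    uint64_t target_st_mtime_nsec;
    int64_t  target_st_ctime;
    uint64_t target_st_ctime_nsec;
    uint32_t unused[2];
};

/* Every member is naturally aligned, so no hidden padding is expected. */
_Static_assert(sizeof(struct toy_a64_stat) == 128,
               "AArch64 stat sketch should be 128 bytes");

static size_t toy_stat_size(void)
{
    return sizeof(struct toy_a64_stat);
}
```

A compile-time check like this catches a mistyped field width immediately, whereas a wrong layout in a syscall structure otherwise tends to show up only as corrupted values in the guest.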

[Qemu-devel] [PATCH v6 05/24] target-arm: Fix target_ulong/uint32_t confusions

2013-09-03 Thread Peter Maydell

From: Alexander Graf 

Correct a few places that were using uint32_t or a 32 bit
only format string to handle something that should be a target_ulong.

Signed-off-by: Alexander Graf 
Signed-off-by: John Rigby 
[PMM: split out to separate patch; added gen_goto_tb() and
gen_set_pc_im() dest params to list of things to change.]
Signed-off-by: Peter Maydell 
---
 target-arm/cpu.h   |4 ++--
 target-arm/translate.c |9 +
 2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/target-arm/cpu.h b/target-arm/cpu.h
index f2abdf3..8d1cc47 100644
--- a/target-arm/cpu.h
+++ b/target-arm/cpu.h
@@ -823,7 +823,7 @@ static inline bool cpu_has_work(CPUState *cpu)
 #include "exec/exec-all.h"
 
 /* Load an instruction and return it in the standard little-endian order */
-static inline uint32_t arm_ldl_code(CPUARMState *env, uint32_t addr,
+static inline uint32_t arm_ldl_code(CPUARMState *env, target_ulong addr,
 bool do_swap)
 {
 uint32_t insn = cpu_ldl_code(env, addr);
@@ -834,7 +834,7 @@ static inline uint32_t arm_ldl_code(CPUARMState *env, uint32_t addr,
 }
 
 /* Ditto, for a halfword (Thumb) instruction */
-static inline uint16_t arm_lduw_code(CPUARMState *env, uint32_t addr,
+static inline uint16_t arm_lduw_code(CPUARMState *env, target_ulong addr,
  bool do_swap)
 {
 uint16_t insn = cpu_lduw_code(env, addr);
diff --git a/target-arm/translate.c b/target-arm/translate.c
index a6adcc8..5a465fc 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -904,7 +904,7 @@ DO_GEN_ST(st8)
 DO_GEN_ST(st16)
 DO_GEN_ST(st32)
 
-static inline void gen_set_pc_im(uint32_t val)
+static inline void gen_set_pc_im(target_ulong val)
 {
 tcg_gen_movi_i32(cpu_R[15], val);
 }
@@ -3412,7 +3412,7 @@ static int disas_vfp_insn(CPUARMState * env, DisasContext *s, uint32_t insn)
 return 0;
 }
 
-static inline void gen_goto_tb(DisasContext *s, int n, uint32_t dest)
+static inline void gen_goto_tb(DisasContext *s, int n, target_ulong dest)
 {
 TranslationBlock *tb;
 
@@ -9992,7 +9992,7 @@ static inline void gen_intermediate_code_internal(ARMCPU *cpu,
 uint16_t *gen_opc_end;
 int j, lj;
 target_ulong pc_start;
-uint32_t next_page_start;
+target_ulong next_page_start;
 int num_insns;
 int max_insns;
 
@@ -10146,7 +10146,8 @@ static inline void gen_intermediate_code_internal(ARMCPU *cpu,
 }
 
 if (tcg_check_temp_count()) {
-fprintf(stderr, "TCG temporary leak before %08x\n", dc->pc);
+fprintf(stderr, "TCG temporary leak before "TARGET_FMT_lx"\n",
+dc->pc);
 }
 
 /* Translation stops when a conditional branch is encountered.
-- 
1.7.9.5
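
The class of bug fixed by the patch above can be reproduced in a few lines: once guest addresses are 64 bits wide, routing one through a uint32_t silently drops the upper half, and a "%08x" format string no longer matches the type. The sketch below assumes a 64-bit target; toy_target_ulong and TOY_FMT_lx are hypothetical stand-ins for target_ulong and TARGET_FMT_lx, and the toy_* functions are not QEMU code.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

typedef uint64_t toy_target_ulong;      /* stand-in for a 64-bit target_ulong */
#define TOY_FMT_lx "%016" PRIx64        /* stand-in for the matching format */

/* What happens when an address passes through a uint32_t parameter. */
static uint64_t toy_addr_via_u32(uint64_t addr)
{
    uint32_t narrowed = (uint32_t)addr; /* upper 32 bits are lost here */
    return narrowed;
}

/* The fixed variant keeps the full width end to end. */
static uint64_t toy_addr_via_ulong(toy_target_ulong addr)
{
    return addr;
}

/* Format an address with a specifier that matches its type. */
static int toy_format_addr(char *buf, size_t len, toy_target_ulong addr)
{
    assert(buf != NULL);
    return snprintf(buf, len, TOY_FMT_lx, addr);
}
```

On a 32-bit target both variants behave identically, which is why this kind of mismatch tends to stay hidden until a 64-bit target is added.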

[Qemu-devel] [PULL] s390: cleanups and fixes

2013-09-03 Thread Christian Borntraeger

Anthony,


The following changes since commit 4ff78e0dbcd5c795962567fdc1b31e9e03c55b07:

  Merge remote-tracking branch 'luiz/queue/qmp' into staging (2013-08-30 12:26:04 -0500)

are available in the git repository at:


  git://github.com/borntraeger/qemu.git tags/s390-20130902

for you to fetch changes up to d66b1005d2ade6ce7854581aac6f3222f6dd7ea4:

  s390/ioinst: Moved the CC setting to the IO instruction handlers (2013-09-02 16:55:14 +0200)


This is a bunch of cleanups and fixes for the s390 architecture.
Series is
Acked-by: Alexander Graf 


Christian Borntraeger (2):
  s390/dump: zero out padding bytes in notes sections
  s390/ipl: Update the s390-ccw.img rom

Cornelia Huck (1):
  s390/ipl: Fix waiting for virtio processing

Thomas Huth (3):
  s390/kvm: Add check for priviledged SCLP handler
  s390/cpu: Make setcc() function available to other files
  s390/ioinst: Moved the CC setting to the IO instruction handlers

 pc-bios/s390-ccw.img  | Bin 9432 -> 9336 bytes
 pc-bios/s390-ccw/virtio.c |   7 +--
 pc-bios/s390-ccw/virtio.h |   1 +
 target-s390x/arch_dump.c  |   1 +
 target-s390x/cpu.h|  11 -
 target-s390x/ioinst.c | 110 +-
 target-s390x/ioinst.h |  26 +--
 target-s390x/kvm.c|  54 ---
 8 files changed, 96 insertions(+), 114 deletions(-)

Re: [Qemu-devel] [PATCH] seccomp: adding a second whitelist

2013-09-03 Thread Corey Bryant




On 09/03/2013 02:21 PM, Paul Moore wrote:
> On Tuesday, September 03, 2013 02:08:28 PM Corey Bryant wrote:
>> On 09/03/2013 02:02 PM, Corey Bryant wrote:
>>> On 08/30/2013 10:21 AM, Eduardo Otubo wrote:
>>>> On 08/29/2013 05:34 AM, Stefan Hajnoczi wrote:
>>>>> On Wed, Aug 28, 2013 at 10:04:32PM -0300, Eduardo Otubo wrote:
>>>>>> Now there's a second whitelist, right before the vcpu starts. The second
>>>>>> whitelist is the same as the first one, except for exec() and select().
>>>>>
>>>>> -netdev tap,downscript=/path/to/script requires exec() in the QEMU
>>>>> shutdown code path.  Will this work with seccomp?
>>>>
>>>> I actually don't know, but I'll test that as well. Can you run a test
>>>> with this patch and -netdev? I mean, if you're pointing that out you
>>>> might have a scenario already setup, right?
>>>>
>>>> Thanks!
>>>
>>> This uses exec() in net/tap.c.
>>>
>>> I think if we're going to introduce a sandbox environment that restricts
>>> existing QEMU behavior, then we have to introduce a new argument to the
>>> -sandbox option.  So for example, "-sandbox on" would continue to use
>>> the whitelist that allows everything in QEMU to work (or at least it
>>> should :).  And something like "-sandbox on,strict=on" would use the
>>> whitelist + blacklist.
>>>
>>> If this is acceptable though, then I wonder how we could go about adding
>>> new syscalls to the blacklist in future QEMU releases without regressing
>>> "-sandbox on,strict=on".
>>
>> Maybe a better approach is to provide support that allows libvirt to
>> define the blacklist and pass it to QEMU?
>
> FYI: I didn't want to mention this until I had some patches ready to post, but
> I'm currently working on adding syscall filtering, via libseccomp, to libvirt.
> I hope to get an initial RFC-quality patch out "soon".

Great, looking forward to seeing them.

--
Regards,
Corey Bryant

Re: [Qemu-devel] [PATCH] configure: Enable extra compiler warnings

2013-09-03 Thread Stefan Weil

Am 21.08.2013 07:44, schrieb Stefan Weil:
> Compiler option -Wextra enables an additional set of compiler warnings.
>
> Some of these warnings were already enabled explicitly in QEMU:
> -Wold-style-declaration, -Wtype-limits, -Wignored-qualifiers and
> -Wempty-body are now redundant and can be removed.
>
> Others don't work with the current code and must be disabled to
> avoid warnings: -Wno-missing-field-initializers, -Wno-override-init,
> -Wno-sign-compare and -Wno-unused-parameter.
>
> Signed-off-by: Stefan Weil 
> ---
> This is a rather old and long tested patch: I use -Wextra in my
> QEMU builds for more than a year now. At least one bug was found
> by using this warning level (see commit
> b22dd1243f38286263d40496ce5298a8a7d96eea).
>
> My tests include Linux and Windows hosts (gcc), but not BSD based hosts,
> so maybe those hosts might need additional code fixes.
> clang reports lots of -Wunused-value warnings.
>
> Regards,
> Stefan
>
>  configure |   14 +-
>  1 file changed, 9 insertions(+), 5 deletions(-)

Ping? I'd appreciate getting this patch committed.
Is it trivial enough for qemu-trivial?

Stefan

See also http://patchwork.ozlabs.org/patch/268687/
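
For readers unfamiliar with the warnings named in the patch description, here is a small sketch of the kind of code they flag. It is an illustration only, not taken from QEMU; the function names are made up. -Wtype-limits is one of the options the description says was already enabled explicitly, and -Wsign-compare is one of those it proposes to disable for now.

```c
/* Illustrative only: hypothetical helpers, not QEMU code. */

/* -Wtype-limits: v is unsigned, so this comparison is always false. */
static int is_negative_buggy(unsigned int v)
{
    return v < 0;
}

/* -Wsign-compare: a is converted to unsigned before the comparison,
 * so a negative a compares as a very large value. */
static int less_than_buggy(int a, unsigned int b)
{
    return a < b;
}
```

Both functions compile without complaint at default warning levels, which is why code like this can stay unnoticed until the extra warnings are turned on.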

Re: [Qemu-devel] [PATCH] seccomp: adding a second whitelist

2013-09-03 Thread Corey Bryant




On 09/03/2013 02:02 PM, Corey Bryant wrote:
> On 08/30/2013 10:21 AM, Eduardo Otubo wrote:
>> On 08/29/2013 05:34 AM, Stefan Hajnoczi wrote:
>>> On Wed, Aug 28, 2013 at 10:04:32PM -0300, Eduardo Otubo wrote:
>>>> Now there's a second whitelist, right before the vcpu starts. The second
>>>> whitelist is the same as the first one, except for exec() and select().
>>>
>>> -netdev tap,downscript=/path/to/script requires exec() in the QEMU
>>> shutdown code path.  Will this work with seccomp?
>>
>> I actually don't know, but I'll test that as well. Can you run a test
>> with this patch and -netdev? I mean, if you're pointing that out you
>> might have a scenario already setup, right?
>>
>> Thanks!
>
> This uses exec() in net/tap.c.
>
> I think if we're going to introduce a sandbox environment that restricts
> existing QEMU behavior, then we have to introduce a new argument to the
> -sandbox option.  So for example, "-sandbox on" would continue to use
> the whitelist that allows everything in QEMU to work (or at least it
> should :).  And something like "-sandbox on,strict=on" would use the
> whitelist + blacklist.
>
> If this is acceptable though, then I wonder how we could go about adding
> new syscalls to the blacklist in future QEMU releases without regressing
> "-sandbox on,strict=on".

Maybe a better approach is to provide support that allows libvirt to
define the blacklist and pass it to QEMU?

> By the way, are any test buckets running regularly with -sandbox on?



--
Regards,
Corey Bryant

[Qemu-devel] Quirky/Permissive PCI device in KVM

2013-09-03 Thread Saurabh Mishra

Hi,

We have a quirky PCI device which requires PCI config write access. We had
modified /etc/xen/xend-pci-permissive.sxp and /etc/xen/xend-pci-quirks.sxp
to give full access to PCI config space of our home grown PCI device for
Xen.


Kindly let me know how we can configure the same in KVM, since we are planning
to explore KVM as well.


I'm using the virsh create method to create an HVM guest.

Thanks,
/Saurabh

Re: [Qemu-devel] [PATCH] seccomp: adding a second whitelist

2013-09-03 Thread Paul Moore

On Tuesday, September 03, 2013 02:08:28 PM Corey Bryant wrote:
> On 09/03/2013 02:02 PM, Corey Bryant wrote:
> > On 08/30/2013 10:21 AM, Eduardo Otubo wrote:
> >> On 08/29/2013 05:34 AM, Stefan Hajnoczi wrote:
> >>> On Wed, Aug 28, 2013 at 10:04:32PM -0300, Eduardo Otubo wrote:
> >>>> Now there's a second whitelist, right before the vcpu starts. The
> >>>> second
> >>>> whitelist is the same as the first one, except for exec() and select().
> >>> 
> >>> -netdev tap,downscript=/path/to/script requires exec() in the QEMU
> >>> shutdown code path.  Will this work with seccomp?
> >> 
> >> I actually don't know, but I'll test that as well. Can you run a test
> >> with this patch and -netdev? I mean, if you're pointing that out you
> >> might have a scenario already setup, right?
> >> 
> >> Thanks!
> > 
> > This uses exec() in net/tap.c.
> > 
> > I think if we're going to introduce a sandbox environment that restricts
> > existing QEMU behavior, then we have to introduce a new argument to the
> > -sandbox option.  So for example, "-sandbox on" would continue to use
> > the whitelist that allows everything in QEMU to work (or at least it
> > should :).  And something like "-sandbox on,strict=on" would use the
> > whitelist + blacklist.
> > 
> > If this is acceptable though, then I wonder how we could go about adding
> > new syscalls to the blacklist in future QEMU releases without regressing
> > "-sandbox on,strict=on".
> 
> Maybe a better approach is to provide support that allows libvirt to
> define the blacklist and pass it to QEMU?

FYI: I didn't want to mention this until I had some patches ready to post, but 
I'm currently working on adding syscall filtering, via libseccomp, to libvirt.  
I hope to get an initial RFC-quality patch out "soon".

-- 
paul moore
security and virtualization @ redhat
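
To make the "whitelist + blacklist" idea from the quoted discussion concrete, here is a minimal model in C. It is only a sketch of the policy decision and does not install a real seccomp filter. The `strict` mode corresponds to the proposed "-sandbox on,strict=on" argument, which is a suggestion in this thread and not an existing option; the syscall lists are a tiny made-up subset, and `execve` stands in for the exec() mentioned above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical modes: off, "-sandbox on", and the proposed strict=on. */
typedef enum { SANDBOX_OFF, SANDBOX_ON, SANDBOX_STRICT } SandboxMode;

/* Tiny illustrative subset, not QEMU's real whitelist. */
static const char *const whitelist[] = { "read", "write", "execve", "select" };
/* Syscalls additionally denied in the hypothetical strict mode. */
static const char *const strict_blacklist[] = { "execve", "select" };

#define COUNT(a) (sizeof(a) / sizeof((a)[0]))

static bool in_list(const char *const *list, size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(list[i], name) == 0) {
            return true;
        }
    }
    return false;
}

/* Decide whether a syscall would pass the filter in the given mode. */
static bool syscall_allowed(SandboxMode mode, const char *name)
{
    if (mode == SANDBOX_OFF) {
        return true;
    }
    if (!in_list(whitelist, COUNT(whitelist), name)) {
        return false;
    }
    if (mode == SANDBOX_STRICT &&
        in_list(strict_blacklist, COUNT(strict_blacklist), name)) {
        return false;
    }
    return true;
}
```

In this model, the tap downscript case works with plain "on" and fails only under the strict mode, which is the regression concern raised above: every syscall later added to the strict list can break a setup that used to work.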

Re: [Qemu-devel] [PATCH] seccomp: adding a second whitelist

2013-09-03 Thread Corey Bryant




On 08/30/2013 10:21 AM, Eduardo Otubo wrote:
> On 08/29/2013 05:34 AM, Stefan Hajnoczi wrote:
>> On Wed, Aug 28, 2013 at 10:04:32PM -0300, Eduardo Otubo wrote:
>>> Now there's a second whitelist, right before the vcpu starts. The second
>>> whitelist is the same as the first one, except for exec() and select().
>>
>> -netdev tap,downscript=/path/to/script requires exec() in the QEMU
>> shutdown code path.  Will this work with seccomp?
>
> I actually don't know, but I'll test that as well. Can you run a test
> with this patch and -netdev? I mean, if you're pointing that out you
> might have a scenario already setup, right?
>
> Thanks!

This uses exec() in net/tap.c.

I think if we're going to introduce a sandbox environment that restricts
existing QEMU behavior, then we have to introduce a new argument to the
-sandbox option.  So for example, "-sandbox on" would continue to use
the whitelist that allows everything in QEMU to work (or at least it
should :).  And something like "-sandbox on,strict=on" would use the
whitelist + blacklist.

If this is acceptable though, then I wonder how we could go about adding
new syscalls to the blacklist in future QEMU releases without regressing
"-sandbox on,strict=on".

By the way, are any test buckets running regularly with -sandbox on?

--
Regards,
Corey Bryant



Stefan

Re: [Qemu-devel] [Qemu-trivial] [PATCH] cputlb: remove dead function tlb_update_dirty

2013-09-03 Thread Andreas Färber

Am 03.09.2013 13:17, schrieb Michael Tokarev:
> 03.09.2013 12:35, Andreas Färber wrote:
>> I also don't understand why qemu-trivial is suddenly picking up Stefan's
>> arm translation patch, it used to be for unmaintained areas only. But
>> arm is not my problem.
> 
> Which patch you're talking about?  Is it "target-arm: Report unimplemented
> opcodes (LOG_UNIMP)" ?

Yes.

>  If yes, that one appears to be trivial as it just
> adds some logging before failing an instruction and should not conflict
> with other work being done in this area.  Perhaps I was too aggressive
> while picking up the backlog.  We should just draw the line *somewhere*, --

Right, that line is what I'm reminding about here. I feel that lately an
increasing number of contributors and reviewers are deferring patches to
qemu-trivial that don't really belong there IMO. That Anthony doesn't
scale to cover Blue's maintainer work as well shouldn't lead to a surge
on qemu-trivial.

> eg, it sure is possible to reject spelling fixes for maintained areas
> from -trivial (like this arm tree), - will this be productive?

No, spelling fixes are not a concern to me as they are rather unlikely
to cause conflicts with patches being queued by submaintainers. :)

> This change (cputlb: remove dead function) appears to be "trivial enough"
> for me (after looking at the usage history of this function), and I'd
> pick it up without this Andreas's request, too.

Yes. This one here would've been okay usually, as there is no official
maintainer for cputlb.c and it's trivial in the sense that a git-grep
confirms it to be okay. I was just annoyed that I had to defer my pull
twice (sent it out now) because s390x added two CPU loops and then once
that was merged ppc added another loop, too. My upcoming 35+ patch
series qom-cpu-13 may hopefully explain the rest once you see it.

> As for the "suddenly" - it's not really suddenly, it's because it
> has been Cc'd to -trivial (by someone who submitted lots of good
> trivial patches before) and actually looks trivial, too.  And also
> because subsystem maintainer added his Reviewed-by, apparently (or
> hopefully) after noticing it's submitted to -trivial.  I also Cc'd
> both maintainers in my notice that it's been applied to -trivial.

"Suddenly" in the sense that the purpose of qemu-trivial used to be
handling patches that would otherwise fall through the cracks.

So by my understanding, e.g., "target-arm:" => !trivial, and I would've
expected there to be some on-list communication between PMM and you
before CC'ing someone on a "thanks, applied" after the fact.
By contrast, if there's a change to configure or "Fix spelling of" etc.
then you picking it up is highly appreciated. I just don't want
qemu-trivial becoming the least-resistance way of getting patches into
qemu.git that might otherwise get bounced/changed by submaintainers.

Also, I am seeing Paolo pull in huge memory changes but now pinging the
breakage fixes rather than assembling a pull to fix the fallout. ;)

Similarly target-i386 TCG is not suited for qemu-trivial IMO, instead
rth or someone who works on and/or reviews it (rth?) should volunteer as
proper maintainer. With the larger part of the community using KVM these
days, we simply can't have that be handled by the community at large any
more.

So yes, I know you were on vacation and you seem eager to take up work
again, that's great; I'm just cautioning that CC'ing everything on
qemu-trivial (not your fault, you're on the receiving end) can't be the
new solution, so feel encouraged to push back a little. :)

Cheers,
Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

[Qemu-devel] [Bug 977391] Re: BUG: soft lockup - CPU#8 stuck for 61s! [kvm:*] in lucid

2013-09-03 Thread Christopher M. Penalver

Rahul, this bug was reported a while ago and there hasn't been any
activity in it recently. We were wondering if this is still an issue? If
so, could you please test for this with the latest server release of
Ubuntu? ISO images are available from http://releases.ubuntu.com/raring/
.

If it remains an issue, could you please run the following command in
the development release from a Terminal
(Applications->Accessories->Terminal), as it will automatically gather
and attach updated debug information to this report:

apport-collect -p linux 

Also, could you please test the latest upstream kernel available following 
https://wiki.ubuntu.com/KernelMainlineBuilds ? It will allow additional 
upstream developers to examine the issue. Please do not test the daily folder, 
but the one all the way at the bottom. Once you've tested the upstream kernel, 
please comment on which kernel version specifically you tested. If this bug is 
fixed in the mainline kernel, please add the following tags:
kernel-fixed-upstream
kernel-fixed-upstream-VERSION-NUMBER

where VERSION-NUMBER is the version number of the kernel you tested. For 
example:
kernel-fixed-upstream-v3.11-rc7

This can be done by clicking on the yellow circle with a black pencil icon next 
to the word Tags located at the bottom of the bug description. As well, please 
remove the tag:
needs-upstream-testing

If the mainline kernel does not fix this bug, please add the following tags:
kernel-bug-exists-upstream
kernel-bug-exists-upstream-VERSION-NUMBER

As well, please remove the tag:
needs-upstream-testing

Once testing of the upstream kernel is complete, please mark this bug's
Status as Confirmed. Please let us know your results. Thank you for your
understanding.

** Tags added: needs-kernel-logs needs-upstream-testing

** Changed in: linux (Ubuntu)
   Status: Confirmed => Incomplete

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/977391

Title:
  BUG: soft lockup - CPU#8 stuck for 61s! [kvm:*]   in lucid

Status in QEMU:
  Confirmed
Status in “linux” package in Ubuntu:
  Incomplete
Status in “qemu-kvm” package in Ubuntu:
  Confirmed

Bug description:
  Two days back  my KVM base machine got hung up all of a sudden.
  Not sure what exactly happened.

  cat /proc/version_signature 
  Ubuntu 2.6.32-28.55-server 2.6.32.27+drm33.12

  
  -Rahul N.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/977391/+subscriptions

[Qemu-devel] Block Filters

2013-09-03 Thread Benoît Canet


Hello list,

I have been thinking about QEMU block filters lately.

I am not a block.c/blockdev.c expert so tell me what you think of the following.

The use cases I see would be:

-$user wants to have some real cryptography on top of qcow2/qed or another
 format.
 Snapshots and other block features should continue to work.

-$user wants to use a RAID-like feature like QUORUM in QEMU.
 Other features should continue to work.

-$user wants to use the future SSD deduplication implementation with metadata
 on SSD and data on spinning disks.
 Other features should continue to work.

-$user wants to I/O throttle one drive of his VM.

-$user wants to do Copy On Read.

-$user wants to do a combination of the above.

-$developer wants to take the minimum of required steps to keep changes small.

-$developer wants to keep user interface changes for later.

Let's take the example case of a user wanting to do I/O throttled encrypted
QUORUM on top of QCOW2.

Assuming we want to implement throttling and encryption as something remotely
like a block filter, this makes a pretty complex BlockDriverState tree.

The tree would look like the following:

     I/O throttling BlockDriverState (bs)
                     |
       Encryption BlockDriverState (bs)
                     |
         Quorum BlockDriverState (bs)
          /          |          \
         /           |           \
    QCOW2 bs      QCOW2 bs      QCOW2 bs
        |            |             |
     RAW bs        RAW bs        RAW bs
An external snapshot should result in a tree like the following.

     I/O throttling BlockDriverState (bs)
                     |
       Encryption BlockDriverState (bs)
                     |
         Quorum BlockDriverState (bs)
          /          |          \
         /           |           \
    QCOW2 bs      QCOW2 bs      QCOW2 bs
        |            |             |
    QCOW2 bs      QCOW2 bs      QCOW2 bs
        |            |             |
     RAW bs        RAW bs        RAW bs

In the current state of QEMU we can code some block drivers to implement this
tree.

However, when doing operations like snapshots, blockdev.c would have no real
idea of what should be snapshotted and how. (The 3 top bs should be kept on top.)

Moreover, it would have no easy way to manipulate this tree of
BlockDriverStates, as each one is encapsulated in its parent.

Also, there is no generic way to tell the block layer that two or more
BlockDriverStates are siblings.

This mail proposes some additional structures in order to cope with these
problems.

The overall strategy of the proposed structures is to move the relationships
between BlockDriverStates out of each BlockDriverState.

The idea is that it would be easier for the block layer to manipulate a
well-known structure instead of being forced to deal with the specifics of
each BlockDriverState.

The first structure is the BlockStackNode.

The BlockStackNode would be used to represent the relationship between the
various BlockDriverStates.

struct BlockStackNode {
    BlockDriverState *bs;  /* the BlockDriverState held by this node */

    /* this doubly linked list entry points to the child node and the parent
     * node
     */
    QLIST_ENTRY(BlockStackNode) down;

    /* this doubly linked list entry points to the siblings of this node
     */
    QLIST_ENTRY(BlockStackNode) siblings;

    /* a hash or an array of the siblings of this node for fast access;
     * should be recomputed when updating the tree */
    QHASH_ENTRY siblings_hash;
};
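
As a rough illustration of why keeping the relationships outside of the BlockDriverState helps, here is a minimal, self-contained C sketch. It does not use QEMU's QLIST macros or the real BlockDriverState: `Node`, `node_new`, `node_add_child`, `node_insert_above` and `snapshot_children` are hypothetical names, and a plain string stands in for the bs pointer. It builds the throttle/encryption/quorum tree from the example above and then models the external snapshot by inserting a new node above each child of the quorum node.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the proposed BlockStackNode; not QEMU code. */
typedef struct Node {
    const char *name;          /* stands in for BlockDriverState *bs */
    struct Node *parent;       /* link towards the top of the tree */
    struct Node *first_child;  /* link towards the bottom of the tree */
    struct Node *next_sibling; /* next node sharing the same parent */
} Node;

static Node *node_new(const char *name)
{
    Node *n = calloc(1, sizeof(*n));
    if (!n) {
        abort();
    }
    n->name = name;
    return n;
}

/* Append child to the end of parent's child list. */
static void node_add_child(Node *parent, Node *child)
{
    Node **slot = &parent->first_child;
    while (*slot) {
        slot = &(*slot)->next_sibling;
    }
    *slot = child;
    child->parent = parent;
}

static int node_child_count(const Node *n)
{
    int count = 0;
    for (const Node *c = n->first_child; c; c = c->next_sibling) {
        count++;
    }
    return count;
}

/* Number of levels from n down to its deepest leaf, n included. */
static int node_height(const Node *n)
{
    int best = 0;
    for (const Node *c = n->first_child; c; c = c->next_sibling) {
        int h = node_height(c);
        if (h > best) {
            best = h;
        }
    }
    return best + 1;
}

/* Put fresh between child and child's parent; child must have a parent. */
static void node_insert_above(Node *child, Node *fresh)
{
    Node **slot = &child->parent->first_child;
    while (*slot != child) {
        slot = &(*slot)->next_sibling;
    }
    fresh->parent = child->parent;
    fresh->next_sibling = child->next_sibling;
    *slot = fresh;
    child->parent = fresh;
    child->next_sibling = NULL;
    fresh->first_child = child;
}

/* Model an external snapshot: a new qcow2 node on top of every child. */
static void snapshot_children(Node *parent)
{
    Node *c = parent->first_child;
    while (c) {
        Node *next = c->next_sibling; /* saved, c leaves this list below */
        node_insert_above(c, node_new("qcow2"));
        c = next;
    }
}

/* throttle -> encryption -> quorum -> 3 x (qcow2 -> raw) */
static Node *build_example_tree(void)
{
    Node *throttle = node_new("throttle");
    Node *crypt = node_new("encryption");
    Node *quorum = node_new("quorum");
    node_add_child(throttle, crypt);
    node_add_child(crypt, quorum);
    for (int i = 0; i < 3; i++) {
        Node *qcow2 = node_new("qcow2");
        node_add_child(quorum, qcow2);
        node_add_child(qcow2, node_new("raw"));
    }
    return throttle;
}
```

Calling `snapshot_children()` on the quorum node leaves the three top nodes untouched and only grows the tree below them, which is the behaviour wanted for the snapshot case. None of the nodes needs to know anything about the driver it wraps for this to work.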

The BlockBackend would be the structure used to hold the "drive" the guest uses.

struct BlockBackend {
/* the following doubly linked list header point to the top BlockStackNode
 * in our c

Re: [Qemu-devel] [PATCH] linux-headers: update to 3.11

2013-09-03 Thread Paolo Bonzini

Il 03/09/2013 17:28, Alexey Kardashevskiy ha scritto:
> On 09/03/2013 08:42 PM, Jan Kiszka wrote:
>> On 2013-09-03 11:32, Alexey Kardashevskiy wrote:
>>> On 09/03/2013 07:29 PM, Peter Maydell wrote:
>>>> On 3 September 2013 09:27, Alexey Kardashevskiy wrote:
>>>>> Signed-off-by: Alexey Kardashevskiy
>>>>> ---
>>>>>
>>>>> I need this update as VFIO on PPC64/pseries got in upstream kernel
>>>>> and this is required by VFIO-SPAPR bits in QEMU. Others may find this
>>>>> update useful too :)
>>>>> ---
>>>>>  linux-headers/asm-arm64/kvm.h       | 168
>>>>>  linux-headers/asm-arm64/kvm_para.h  |   1 +
>>>>>  linux-headers/asm-mips/kvm.h        |  81 +
>>>>>  linux-headers/linux/kvm.h           |   3 +
>>>>>  linux-headers/linux/vfio.h          |  42 -
>>>>>  linux-headers/linux/virtio_config.h |   3 +
>>>>>  6 files changed, 254 insertions(+), 44 deletions(-)
>>>>>  create mode 100644 linux-headers/asm-arm64/kvm.h
>>>>>  create mode 100644 linux-headers/asm-arm64/kvm_para.h
>>>>
>>>> I think this should go in via the KVM tree, not trivial.
>>>
>>> I do not mind, it just went through the trivial tree last time, that's it.
>>
>> This shouldn't be routed through trivial in general as things broke too
>> often in this area.
> 
> Sorry for my ignorance, but this is The Kernel, it is already there, broken
> or not, even if it is broken, qemu cannot stay isolated, no?
> This is a mechanical change, no more.

It's a matter of keeping things bisectable.  If we can detect a
breakage, we can first work around it, and then apply the header update.
 And if we don't detect it, maintainers usually send pull requests when
they have time to work on breakage caused by their patches.

Paolo

Re: [Qemu-devel] [PATCH v6] kvm irqfd: support direct msimessage to irq translation

2013-09-03 Thread Alexander Graf


On 09/03/2013 10:17 AM, Michael S. Tsirkin wrote:
> On Tue, Sep 03, 2013 at 06:08:25PM +1000, Alexey Kardashevskiy wrote:
>> On PPC64 systems MSI Messages are translated to system IRQ in a PCI
>> host bridge. This is already supported for emulated MSI/MSIX but
>> not for irqfd where the current QEMU allocates IRQ numbers from
>> irqchip and maps MSIMessages to IRQ in the host kernel.
>>
>> This adds a new direct mapping flag which tells
>> the kvm_irqchip_add_msi_route() function that a new VIRQ
>> should not be allocated, instead the value from MSIMessage::data
>> should be used. It is up to the platform code to make sure that
>> this contains a valid IRQ number as sPAPR does in spapr_pci.c.
>>
>> Signed-off-by: Alexey Kardashevskiy
>
> Fine with me
>
> Acked-by: Michael S. Tsirkin

Thanks, applied to ppc-next.


Alex
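
For anyone skimming the thread, the direct mapping described in the quoted commit message can be sketched roughly as follows. This is a standalone model, not the actual patch: `RouteState`, `direct_msi` and `add_msi_route` are hypothetical names, the `MSIMessage` struct is a stand-in with only the fields needed here, and the VIRQ allocator is reduced to a counter.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for QEMU's MSIMessage; only the fields used here. */
typedef struct {
    uint64_t address;
    uint32_t data;
} MSIMessage;

/* Hypothetical routing state: the new flag plus a trivial VIRQ allocator. */
typedef struct {
    bool direct_msi;     /* assumption: name chosen for this sketch */
    int next_free_virq;
} RouteState;

/* Return the VIRQ that an irqfd would be bound to for this message. */
static int add_msi_route(RouteState *s, MSIMessage msg)
{
    if (s->direct_msi) {
        /* Platform code has already placed a valid IRQ number in data. */
        return (int)msg.data;
    }
    /* Default behaviour: allocate a fresh VIRQ and map the message to it. */
    return s->next_free_virq++;
}
```

With the flag set, no allocation takes place and the caller gets back exactly what the platform stored in the message, which is why the commit message puts the burden of providing a valid IRQ number on the platform code.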

Re: [Qemu-devel] [PATCH] linux-headers: update to 3.11

2013-09-03 Thread Alex Williamson

On Wed, 2013-09-04 at 01:47 +1000, Alexey Kardashevskiy wrote:
> On 09/04/2013 01:34 AM, Peter Maydell wrote:
> > On 3 September 2013 16:28, Alexey Kardashevskiy  wrote:
> >> On 09/03/2013 08:42 PM, Jan Kiszka wrote:
> >>> This shouldn't be routed through trivial in general as things broke too
> >>> often in this area.
> >>
> >>
> >> Sorry for my ignorance, but this is The Kernel, it is already there, broken
> >> or not, even if it is broken, qemu cannot stay isolated, no?
> >> This is a mechanical change, no more.
> > 
> > The classic way for things to break is that a header
> > update accidentally reverts something (because a
> > previous update was from kvm-next and this one is
> > from mainline, for example). Accidental updates against
> > a kernel which is neither kvm-next nor mainline are
> > the other common "broken" version of a header update
> > patch.
> 
> I can understand that but this update is a mainline kernel update and it is
> not an accidental one but very specific :-/

I was under the impression that we were only ever updating linux-headers
from mainline, never from kvm-next.  Therefore any mainline tag should
be a reasonable re-base target.  Thanks,

Alex

[Qemu-devel] [uq/master][PATCH 2/3] kvmvapic: Enter inactive state on hardware reset

2013-09-03 Thread Jan Kiszka

ROM layout may change after reset if devices are hotplugged, so we have
to pick up the physical address again when the ROM is initialized. This
is best achieved by resetting the state to INACTIVE.

CC: qemu-sta...@nongnu.org
Signed-off-by: Jan Kiszka 
---
 hw/i386/kvmvapic.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/hw/i386/kvmvapic.c b/hw/i386/kvmvapic.c
index 7ac0fe1..f2e335d 100644
--- a/hw/i386/kvmvapic.c
+++ b/hw/i386/kvmvapic.c
@@ -510,9 +510,7 @@ static void vapic_reset(DeviceState *dev)
 {
 VAPICROMState *s = VAPIC(dev);
 
-if (s->state == VAPIC_ACTIVE) {
-s->state = VAPIC_STANDBY;
-}
+s->state = VAPIC_INACTIVE;
 vapic_enable_tpr_reporting(false);
 }
 
-- 
1.8.1.1.298.ge7eed54
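
The behavioural difference introduced by the hunk above can be modelled in a few lines of standalone C. This is a sketch for illustration, not QEMU code: the enum mirrors the three state names that appear in the diff, while `reset_before` and `reset_after` are hypothetical helpers that only compute the resulting state.

```c
/* State names as they appear in the diff; this model is not QEMU code. */
typedef enum {
    VAPIC_INACTIVE,
    VAPIC_ACTIVE,
    VAPIC_STANDBY
} VAPICState;

/* Old behaviour: only an active VAPIC was demoted, and only to standby. */
static VAPICState reset_before(VAPICState state)
{
    if (state == VAPIC_ACTIVE) {
        return VAPIC_STANDBY;
    }
    return state;
}

/* New behaviour: every reset ends in the inactive state. */
static VAPICState reset_after(VAPICState state)
{
    (void)state; /* the previous state no longer matters */
    return VAPIC_INACTIVE;
}
```

The point of the change is visible in the standby case: the old code kept whatever had been set up before the reset, while the new code forces the inactive state so the ROM's physical address is picked up again when the ROM is initialized.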
