Re: [Qemu-devel] [PATCH V4 0/5] export internal snapshot by qemu-nbd

2013-11-07 Thread Wenchao Xia
  Hello, rebase on upstream?




Re: [Qemu-devel] [PATCH 1/2] i386/pc: propagate flash size from pc_system_flash_init() to pc_init1()

2013-11-07 Thread Jordan Justen
On Thu, Nov 7, 2013 at 2:23 PM, Laszlo Ersek  wrote:
> ... upwards through the following call chain:
>
>   pc_init1() | pc_q35_init()
> pc_memory_init()
>   pc_system_firmware_init()
> pc_system_flash_init()
>
> Signed-off-by: Laszlo Ersek 
> ---
>  include/hw/i386/pc.h |  6 --
>  hw/i386/pc.c |  5 +++--
>  hw/i386/pc_piix.c|  3 ++-
>  hw/i386/pc_q35.c |  3 ++-
>  hw/i386/pc_sysfw.c   | 10 +++---
>  5 files changed, 18 insertions(+), 9 deletions(-)
>
> diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
> index 03cc0ba..a9b938e 100644
> --- a/include/hw/i386/pc.h
> +++ b/include/hw/i386/pc.h
> @@ -148,7 +148,8 @@ FWCfgState *pc_memory_init(MemoryRegion *system_memory,
> ram_addr_t above_4g_mem_size,
> MemoryRegion *rom_memory,
> MemoryRegion **ram_memory,
> -   PcGuestInfo *guest_info);
> +   PcGuestInfo *guest_info,
> +   int64_t **flash_size);

Do you want *flash_size here, and at various other points? After
changing that I could build, and it seemed to fix the symptom.

Also, should we call it firmware_size?

int64_t? :)

-Jordan

>  qemu_irq *pc_allocate_cpu_irq(void);
>  DeviceState *pc_vga_init(ISABus *isa_bus, PCIBus *pci_bus);
>  void pc_basic_device_init(ISABus *isa_bus, qemu_irq *gsi,
> @@ -232,7 +233,8 @@ static inline bool isa_ne2000_init(ISABus *bus, int base, 
> int irq, NICInfo *nd)
>
>  /* pc_sysfw.c */
>  void pc_system_firmware_init(MemoryRegion *rom_memory,
> - bool isapc_ram_fw);
> + bool isapc_ram_fw,
> + int64_t **flash_size);
>
>  /* pvpanic.c */
>  void pvpanic_init(ISABus *bus);
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index 12c436e..3ec18aa 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -1152,7 +1152,8 @@ FWCfgState *pc_memory_init(MemoryRegion *system_memory,
> ram_addr_t above_4g_mem_size,
> MemoryRegion *rom_memory,
> MemoryRegion **ram_memory,
> -   PcGuestInfo *guest_info)
> +   PcGuestInfo *guest_info,
> +   int64_t **flash_size)
>  {
>  int linux_boot, i;
>  MemoryRegion *ram, *option_rom_mr;
> @@ -1186,7 +1187,7 @@ FWCfgState *pc_memory_init(MemoryRegion *system_memory,
>
>
>  /* Initialize PC system firmware */
> -pc_system_firmware_init(rom_memory, guest_info->isapc_ram_fw);
> +pc_system_firmware_init(rom_memory, guest_info->isapc_ram_fw, 
> flash_size);
>
>  option_rom_mr = g_malloc(sizeof(*option_rom_mr));
>  memory_region_init_ram(option_rom_mr, NULL, "pc.rom", PC_ROM_SIZE);
> diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
> index 4fdb7b6..6e2c027 100644
> --- a/hw/i386/pc_piix.c
> +++ b/hw/i386/pc_piix.c
> @@ -89,6 +89,7 @@ static void pc_init1(QEMUMachineInitArgs *args,
>  DeviceState *icc_bridge;
>  FWCfgState *fw_cfg = NULL;
>  PcGuestInfo *guest_info;
> +int64_t flash_size = 0;
>
>  if (xen_enabled() && xen_hvm_init(&ram_memory) != 0) {
>  fprintf(stderr, "xen hardware virtual machine initialisation 
> failed\n");
> @@ -135,7 +136,7 @@ static void pc_init1(QEMUMachineInitArgs *args,
> args->kernel_filename, args->kernel_cmdline,
> args->initrd_filename,
> below_4g_mem_size, above_4g_mem_size,
> -   rom_memory, &ram_memory, guest_info);
> +   rom_memory, &ram_memory, guest_info, &flash_size);
>  }
>
>  gsi_state = g_malloc0(sizeof(*gsi_state));
> diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
> index 4c191d3..90f29e9 100644
> --- a/hw/i386/pc_q35.c
> +++ b/hw/i386/pc_q35.c
> @@ -76,6 +76,7 @@ static void pc_q35_init(QEMUMachineInitArgs *args)
>  PCIDevice *ahci;
>  DeviceState *icc_bridge;
>  PcGuestInfo *guest_info;
> +int64_t flash_size = 0;
>
>  if (xen_enabled() && xen_hvm_init(&ram_memory) != 0) {
>  fprintf(stderr, "xen hardware virtual machine initialisation 
> failed\n");
> @@ -120,7 +121,7 @@ static void pc_q35_init(QEMUMachineInitArgs *args)
> args->kernel_filename, args->kernel_cmdline,
> args->initrd_filename,
> below_4g_mem_size, above_4g_mem_size,
> -   rom_memory, &ram_memory, guest_info);
> +   rom_memory, &ram_memory, guest_info, &flash_size);
>  }
>
>  /* irq lines */
> diff --git a/hw/i386/pc_sysfw.c b/hw/i386/pc_sysfw.c
> index e917c83..0d05dec 100644
> --- a/hw/i386/pc_sysfw.c
> +++ b/hw/i386/pc_sysfw.c
> @@ -73,7 +73,8 @@ static void pc_isa_bios_init(MemoryRegion *rom_memory,
>  }
>
>  static void pc_system_flash_init(MemoryRegion *rom_memory,
> - 

[Qemu-devel] [PULL 1.7 0/2] target-xtensa fixes

2013-11-07 Thread Max Filippov
Hi Anthony,

Please pull my current target-xtensa patch queue.
There are two fixes: one adds missing features for dc233c xtensa core,
another fixes qemu abort caused by gdb (generic breakpoint_invalidate,
reviewed by Paolo).

The following changes since commit 964668b03d26f0b5baa5e5aff0c966f4fcb76e9e:

  Update version for 1.7.0-rc0 release (2013-11-06 21:49:39 -0800)

are available in the git repository at:

  git://github.com/OSLL/qemu-xtensa.git tags/20131108-xtensa

for you to fetch changes up to ebbfd5a094ead6f28d45718f97b4837bd3e7f916:

  target-xtensa: add missing DEBUG section to dc233c config (2013-11-08 
09:26:07 +0400)


Two small fixes for 1.7:

- add missing debug feature to dc233c xtensa core;
- fix qemu abort caused by gdb attempt to invalidate a breakpoint by
  virtual address for which there's no mapping.


Max Filippov (2):
  exec: fix breakpoint_invalidate when pc may not be translated
  target-xtensa: add missing DEBUG section to dc233c config

 exec.c  | 6 --
 target-xtensa/core-dc233c.c | 1 +
 2 files changed, 5 insertions(+), 2 deletions(-)

-- 
1.8.1.4




[Qemu-devel] [PULL 1.7 1/2] spapr: add vio-bus devices to categories

2013-11-07 Thread Alexander Graf
From: Alexey Kardashevskiy 

In order to get devices appear in output of
"./qemu-system-ppc64 -device ?",
they must be assigned to one of DEVICE_CATEGORY_.

This puts VIO devices classes to corresponding categories.

Signed-off-by: Alexey Kardashevskiy 
Signed-off-by: Alexander Graf 
---
 hw/char/spapr_vty.c| 1 +
 hw/net/spapr_llan.c| 1 +
 hw/nvram/spapr_nvram.c | 1 +
 hw/scsi/spapr_vscsi.c  | 1 +
 4 files changed, 4 insertions(+)

diff --git a/hw/char/spapr_vty.c b/hw/char/spapr_vty.c
index 9c2aef8..f8a4981 100644
--- a/hw/char/spapr_vty.c
+++ b/hw/char/spapr_vty.c
@@ -168,6 +168,7 @@ static void spapr_vty_class_init(ObjectClass *klass, void 
*data)
 k->dt_name = "vty";
 k->dt_type = "serial";
 k->dt_compatible = "hvterm1";
+set_bit(DEVICE_CATEGORY_INPUT, dc->categories);
 dc->props = spapr_vty_properties;
 dc->vmsd = &vmstate_spapr_vty;
 }
diff --git a/hw/net/spapr_llan.c b/hw/net/spapr_llan.c
index 4ff0411..1bd6f50 100644
--- a/hw/net/spapr_llan.c
+++ b/hw/net/spapr_llan.c
@@ -531,6 +531,7 @@ static void spapr_vlan_class_init(ObjectClass *klass, void 
*data)
 k->dt_type = "network";
 k->dt_compatible = "IBM,l-lan";
 k->signal_mask = 0x1;
+set_bit(DEVICE_CATEGORY_NETWORK, dc->categories);
 dc->props = spapr_vlan_properties;
 k->rtce_window_size = 0x1000;
 dc->vmsd = &vmstate_spapr_llan;
diff --git a/hw/nvram/spapr_nvram.c b/hw/nvram/spapr_nvram.c
index eb4500e..beaad68 100644
--- a/hw/nvram/spapr_nvram.c
+++ b/hw/nvram/spapr_nvram.c
@@ -182,6 +182,7 @@ static void spapr_nvram_class_init(ObjectClass *klass, void 
*data)
 k->dt_name = "nvram";
 k->dt_type = "nvram";
 k->dt_compatible = "qemu,spapr-nvram";
+set_bit(DEVICE_CATEGORY_MISC, dc->categories);
 dc->props = spapr_nvram_properties;
 }
 
diff --git a/hw/scsi/spapr_vscsi.c b/hw/scsi/spapr_vscsi.c
index 2a26042..c0c46d7 100644
--- a/hw/scsi/spapr_vscsi.c
+++ b/hw/scsi/spapr_vscsi.c
@@ -1223,6 +1223,7 @@ static void spapr_vscsi_class_init(ObjectClass *klass, 
void *data)
 k->dt_type = "vscsi";
 k->dt_compatible = "IBM,v-scsi";
 k->signal_mask = 0x0001;
+set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
 dc->props = spapr_vscsi_properties;
 k->rtce_window_size = 0x1000;
 dc->vmsd = &vmstate_spapr_vscsi;
-- 
1.8.1.4




[Qemu-devel] [PULL 1.7 0/2] ppc patch queue 2013-11-08

2013-11-07 Thread Alexander Graf
Hi Blue / Aurelien / Anthony,

This is my current patch queue for ppc.  Please pull.

Alex


The following changes since commit 964668b03d26f0b5baa5e5aff0c966f4fcb76e9e:

  Update version for 1.7.0-rc0 release (2013-11-06 21:49:39 -0800)

are available in the git repository at:

  git://github.com/agraf/qemu.git tags/signed-ppc-for-upstream-1.7

for you to fetch changes up to 9d0e1dac108ce90cbe62c89af57a7ace006f5152:

  pseries: Update SLOF firmware image (2013-11-08 04:33:19 +0100)


Patch queue for ppc - 2013-11-08

These are two patches that will hopefully make it into 1.7. The SLOF update
fixes -append kernel command line argument passing into the guest kernel. The
other patch makes VIO devices appear when using -device '?'.


Alexey Kardashevskiy (2):
  spapr: add vio-bus devices to categories
  pseries: Update SLOF firmware image

 hw/char/spapr_vty.c|   1 +
 hw/net/spapr_llan.c|   1 +
 hw/nvram/spapr_nvram.c |   1 +
 hw/scsi/spapr_vscsi.c  |   1 +
 pc-bios/README |   2 +-
 pc-bios/slof.bin   | Bin 875424 -> 873920 bytes
 roms/SLOF  |   2 +-
 7 files changed, 6 insertions(+), 2 deletions(-)



[Qemu-devel] [PATCH v4] spapr: make sure RMA is in first mode of first memory node

2013-11-07 Thread Alexey Kardashevskiy
The SPAPR specification says that the RMA starts at the LPAR's logical
address 0 and is the first logical memory block reported in
the LPAR’s device tree.

So SLOF only maps the first block and that block needs to span
the full RMA.

This makes sure that the RMA area is where SLOF expects it.

Cc: Thomas Huth 
Cc: Benjamin Herrenschmidt 
Signed-off-by: Alexey Kardashevskiy 
---
Changes:
v4:
* fixed a bug with preallocated RMA (thanks to Thomas Huth)

v3:
* removed unnecessary RMA fixup from spapr_populate_memory()

v2:
* changed as recommended by Alex Graf
---
 hw/ppc/spapr.c | 14 +-
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 7e53a5f..036246c 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -532,9 +532,6 @@ static int spapr_populate_memory(sPAPREnvironment *spapr, 
void *fdt)
 
 /* memory node(s) */
 node0_size = (nb_numa_nodes > 1) ? node_mem[0] : ram_size;
-if (spapr->rma_size > node0_size) {
-spapr->rma_size = node0_size;
-}
 
 /* RMA */
 mem_reg_property[0] = 0;
@@ -1113,6 +1110,7 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
 MemoryRegion *sysmem = get_system_memory();
 MemoryRegion *ram = g_new(MemoryRegion, 1);
 hwaddr rma_alloc_size;
+hwaddr node0_size = (nb_numa_nodes > 1) ? node_mem[0] : ram_size;
 uint32_t initrd_base = 0;
 long kernel_size = 0, initrd_size = 0;
 long load_limit, rtas_limit, fw_size;
@@ -1134,10 +1132,10 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
 exit(1);
 }
 
-if (rma_alloc_size && (rma_alloc_size < ram_size)) {
+if (rma_alloc_size && (rma_alloc_size < node0_size)) {
 spapr->rma_size = rma_alloc_size;
 } else {
-spapr->rma_size = ram_size;
+spapr->rma_size = node0_size;
 
 /* With KVM, we don't actually know whether KVM supports an
  * unbounded RMA (PR KVM) or is limited by the hash table size
@@ -1154,6 +1152,12 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
 }
 }
 
+if (spapr->rma_size > node0_size) {
+fprintf(stderr, "Error: Numa node 0 has to span the RMA 
(%#08"HWADDR_PRIx")\n",
+spapr->rma_size);
+exit(1);
+}
+
 /* We place the device tree and RTAS just below either the top of the RMA,
  * or just below 2GB, whichever is lowere, so that it can be
  * processed with 32-bit real mode code if necessary */
-- 
1.8.4.rc4




[Qemu-devel] [PATCH for-1.7 v2] block: Print its file name if backing file opening failed

2013-11-07 Thread Fam Zheng
If backing file doesn't exist, the error message is confusing and
misleading:

$ qemu /tmp/a.qcow2
qemu: could not open disk image /tmp/a.qcow2: Could not open file: No
such file or directory

But...

$ ls /tmp/a.qcow2
/tmp/a.qcow2

$ qemu-img info /tmp/a.qcow2
image: /tmp/a.qcow2
file format: qcow2
virtual size: 8.0G (8589934592 bytes)
disk size: 196K
cluster_size: 65536
backing file: /tmp/b.qcow2

Because...

$ ls /tmp/b.qcow2
ls: cannot access /tmp/b.qcow2: No such file or directory

This is not intuitive. It's better to have the missing file's name in
the error message. With this patch:

$ qemu-io -c 'read 0 512' /tmp/a.qcow2
qemu-io: can't open device /tmp/a.qcow2: Could not open backing
file: Could not open '/stor/vm/arch.raw': No such file or directory
no file open, try 'help open'

Which is a little bit better.

Signed-off-by: Fam Zheng 

---
v2: Don't leak local_err (Eric).

Signed-off-by: Fam Zheng 
---
 block.c| 4 +++-
 block/raw-posix.c  | 1 -
 block/raw-win32.c  | 1 -
 tests/qemu-iotests/051.out | 2 +-
 tests/qemu-iotests/069.out | 2 +-
 5 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/block.c b/block.c
index f706634..7ffb08b 100644
--- a/block.c
+++ b/block.c
@@ -1009,7 +1009,9 @@ int bdrv_open_backing_file(BlockDriverState *bs, QDict 
*options, Error **errp)
 bdrv_unref(bs->backing_hd);
 bs->backing_hd = NULL;
 bs->open_flags |= BDRV_O_NO_BACKING;
-error_propagate(errp, local_err);
+error_setg(errp, "Could not open backing file: %s",
+   error_get_pretty(local_err));
+error_free(local_err);
 return ret;
 }
 pstrcpy(bs->backing_file, sizeof(bs->backing_file),
diff --git a/block/raw-posix.c b/block/raw-posix.c
index f6d48bb..e1b1ba2 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -310,7 +310,6 @@ static int raw_open_common(BlockDriverState *bs, QDict 
*options,
 if (ret == -EROFS) {
 ret = -EACCES;
 }
-error_setg_errno(errp, -ret, "Could not open file");
 goto fail;
 }
 s->fd = fd;
diff --git a/block/raw-win32.c b/block/raw-win32.c
index 2741e4d..2bad5a3 100644
--- a/block/raw-win32.c
+++ b/block/raw-win32.c
@@ -280,7 +280,6 @@ static int raw_open(BlockDriverState *bs, QDict *options, 
int flags,
 } else {
 ret = -EINVAL;
 }
-error_setg_errno(errp, -ret, "Could not open file");
 goto fail;
 }
 
diff --git a/tests/qemu-iotests/051.out b/tests/qemu-iotests/051.out
index 15deef6..d351935 100644
--- a/tests/qemu-iotests/051.out
+++ b/tests/qemu-iotests/051.out
@@ -226,6 +226,6 @@ Testing: -drive file=foo:bar
 QEMU_PROG: -drive file=foo:bar: could not open disk image foo:bar: Unknown 
protocol
 
 Testing: -drive file.filename=foo:bar
-QEMU_PROG: -drive file.filename=foo:bar: could not open disk image ide0-hd0: 
Could not open file: No such file or directory
+QEMU_PROG: -drive file.filename=foo:bar: could not open disk image ide0-hd0: 
Could not open 'foo:bar': No such file or directory
 
 *** done
diff --git a/tests/qemu-iotests/069.out b/tests/qemu-iotests/069.out
index 3648814..b48306d 100644
--- a/tests/qemu-iotests/069.out
+++ b/tests/qemu-iotests/069.out
@@ -4,5 +4,5 @@ QA output created by 069
 
 Formatting 'TEST_DIR/t.IMGFMT.base', fmt=IMGFMT size=131072 
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=131072 
backing_file='TEST_DIR/t.IMGFMT.base' 
-qemu-io: can't open device TEST_DIR/t.IMGFMT: Could not open file: No such 
file or directory
+qemu-io: can't open device TEST_DIR/t.IMGFMT: Could not open backing file: 
Could not open 'TEST_DIR/t.IMGFMT.base': No such file or directory
 *** done
-- 
1.8.3.1




Re: [Qemu-devel] [PATCH] PPC: fix PCI configuration space MemoryRegions for grackle/uninorth

2013-11-07 Thread Alexander Graf

On 11.10.2013, at 12:53, Mark Cave-Ayland  wrote:

> OpenBIOS prior to SVN r1225 had a horrible bug when accessing PCI
> configuration space for PPC Mac architectures - instead of writing the PCI
> configuration data value to the data register address, it would instead write
> it to the data register address plus the PCI configuration address.
> 
> For this reason, the MemoryRegions for the PCI data register for
> grackle/uninorth were extremely large in order to accomodate the entire PCI
> configuration space being accessed during OpenBIOS PCI bus enumeration. Now
> that the OpenBIOS images have been updated, reduce the MemoryRegion sizes down
> to a single 32-bit register as intended.
> 
> Signed-off-by: Mark Cave-Ayland 
> CC: Hervé Poussineau 
> CC: Andreas Färber 
> CC: Alexander Graf 

With this patch applied, mac99 emulation seems to break:

  
http://award.ath.cx/results/288-alex/x86/kvm.qemu-git-tcg.ppc-debian.mac99-G4.etch.e1000.reboot/debug/serial-vm1.log


Alex




Re: [Qemu-devel] [PATCH v5 0/6] block: allow commit active as top

2013-11-07 Thread Fam Zheng

On 10/09/2013 01:19 PM, Fam Zheng wrote:

Previously live commit of active block device is not supported, this series
implements it and updates corresponding qemu-iotests cases.

This series is based on BlockJobType enum QAPI series.

v5: Address comments from Eric and Paolo:
 Add mirror_start_job and front end wrapper. [Paolo]
 Base on BlockJobType enum in QAPI. [Eric]
 Drop "common" sync mode. [Eric]

v4: Rewrite to reuse block/mirror.c.
 When committing the active layer, the job is internally a mirror job with
 type name faked to "commit".
 When the job completes, the BDSes are swapped, so the base image become
 active and [top, base) dropped.


Fam Zheng (6):
   mirror: don't close target
   mirror: move base to MirrorBlockJob
   block: add commit_active_start()
   commit: support commit active layer
   qemu-iotests: update test cases for commit active
   commit: remove unused check

  block/commit.c|  8 +
  block/mirror.c| 77 +++
  blockdev.c|  9 --
  include/block/block_int.h | 22 --
  qapi-schema.json  |  5 +--
  tests/qemu-iotests/040| 73 +++-
  6 files changed, 125 insertions(+), 69 deletions(-)



Ping?



Re: [Qemu-devel] [PATCH] qapi-schema: Update description for NewImageMode

2013-11-07 Thread Wenchao Xia
于 2013/11/8 2:47, Max Reitz 写道:
> If the NewImageMode is "absolute-paths" but no backing file is available
> (e.g., when mirroring a device with an unbacked image), the target image
> will not be backed either. This patch updates the documentation in
> qapi-schema.json accordingly.
> 
> Signed-off-by: Max Reitz 
> ---
> Follow-up to:
>   - block/drive-mirror: Check for NULL backing_hd
> ---
>   qapi-schema.json | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/qapi-schema.json b/qapi-schema.json
> index 81a375b..dde8e45 100644
> --- a/qapi-schema.json
> +++ b/qapi-schema.json
> @@ -1736,7 +1736,8 @@
>   # @existing: QEMU should look for an existing image file.
>   #
>   # @absolute-paths: QEMU should create a new image with absolute paths
> -# for the backing file.
> +# for the backing file. If there is no backing file available, the new
> +# image will not be backed either.
>   #
>   # Since: 1.1
>   ##
> 
  It seems 2nd line should start with space, how about:
# @absolute-paths: QEMU should create a new image with absolute paths,
#  the new image would usually have a backing file. If
#  no backing file available, the new image will not be
#  backed either.




[Qemu-devel] [PATCH v2] target-ppc: move POWER7+ to a separate family

2013-11-07 Thread Alexey Kardashevskiy
So far POWER7+ was a part of POWER7 family. However it has a different
PVR base value so in order to support PVR masks, it needs a separate
family class.

Another reason to make a POWER7+ family is that its name in the device
tree (/proc/device-tree/cpus/cpu*) should be "Power7+" but not "Power7"
and this cannot be easily fixed without a new family class.

This adds a new family class, PVR base and mask values and moves
Power7+ v2.1 CPU to a new family. The class init function is copied
from the POWER7 family.

Signed-off-by: Alexey Kardashevskiy 
---
Changes:
v2:
* added VSX enable bit
---
 target-ppc/cpu-models.c |  2 +-
 target-ppc/cpu-models.h |  2 ++
 target-ppc/translate_init.c | 38 ++
 3 files changed, 41 insertions(+), 1 deletion(-)

diff --git a/target-ppc/cpu-models.c b/target-ppc/cpu-models.c
index 04d88c5..7c9466f 100644
--- a/target-ppc/cpu-models.c
+++ b/target-ppc/cpu-models.c
@@ -1140,7 +1140,7 @@
 "POWER7 v2.1")
 POWERPC_DEF("POWER7_v2.3",   CPU_POWERPC_POWER7_v23, POWER7,
 "POWER7 v2.3")
-POWERPC_DEF("POWER7+_v2.1",  CPU_POWERPC_POWER7P_v21,POWER7,
+POWERPC_DEF("POWER7+_v2.1",  CPU_POWERPC_POWER7P_v21,POWER7P,
 "POWER7+ v2.1")
 POWERPC_DEF("POWER8_v1.0",   CPU_POWERPC_POWER8_v10, POWER8,
 "POWER8 v1.0")
diff --git a/target-ppc/cpu-models.h b/target-ppc/cpu-models.h
index 731ec4a..49ba4a4 100644
--- a/target-ppc/cpu-models.h
+++ b/target-ppc/cpu-models.h
@@ -558,6 +558,8 @@ enum {
 CPU_POWERPC_POWER7_v20 = 0x003F0200,
 CPU_POWERPC_POWER7_v21 = 0x003F0201,
 CPU_POWERPC_POWER7_v23 = 0x003F0203,
+CPU_POWERPC_POWER7P_BASE   = 0x004A,
+CPU_POWERPC_POWER7P_MASK   = 0x,
 CPU_POWERPC_POWER7P_v21= 0x004A0201,
 CPU_POWERPC_POWER8_BASE= 0x004B,
 CPU_POWERPC_POWER8_MASK= 0x,
diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
index 35d1389..c030a20 100644
--- a/target-ppc/translate_init.c
+++ b/target-ppc/translate_init.c
@@ -7253,6 +7253,44 @@ POWERPC_FAMILY(POWER7)(ObjectClass *oc, void *data)
 pcc->l1_icache_size = 0x8000;
 }
 
+POWERPC_FAMILY(POWER7P)(ObjectClass *oc, void *data)
+{
+DeviceClass *dc = DEVICE_CLASS(oc);
+PowerPCCPUClass *pcc = POWERPC_CPU_CLASS(oc);
+
+dc->fw_name = "PowerPC,POWER7+";
+dc->desc = "POWER7+";
+pcc->pvr = CPU_POWERPC_POWER7P_BASE;
+pcc->pvr_mask = CPU_POWERPC_POWER7P_MASK;
+pcc->init_proc = init_proc_POWER7;
+pcc->check_pow = check_pow_nocheck;
+pcc->insns_flags = PPC_INSNS_BASE | PPC_ISEL | PPC_STRING | PPC_MFTB |
+   PPC_FLOAT | PPC_FLOAT_FSEL | PPC_FLOAT_FRES |
+   PPC_FLOAT_FSQRT | PPC_FLOAT_FRSQRTE |
+   PPC_FLOAT_STFIWX |
+   PPC_CACHE | PPC_CACHE_ICBI | PPC_CACHE_DCBZ |
+   PPC_MEM_SYNC | PPC_MEM_EIEIO |
+   PPC_MEM_TLBIE | PPC_MEM_TLBSYNC |
+   PPC_64B | PPC_ALTIVEC |
+   PPC_SEGMENT_64B | PPC_SLBI |
+   PPC_POPCNTB | PPC_POPCNTWD;
+pcc->insns_flags2 = PPC2_VSX | PPC2_DFP | PPC2_DBRX | PPC2_ISA205;
+pcc->msr_mask = 0x8204FF37ULL;
+pcc->mmu_model = POWERPC_MMU_2_06;
+#if defined(CONFIG_SOFTMMU)
+pcc->handle_mmu_fault = ppc_hash64_handle_mmu_fault;
+#endif
+pcc->excp_model = POWERPC_EXCP_POWER7;
+pcc->bus_model = PPC_FLAGS_INPUT_POWER7;
+pcc->bfd_mach = bfd_mach_ppc64;
+pcc->flags = POWERPC_FLAG_VRE | POWERPC_FLAG_SE |
+ POWERPC_FLAG_BE | POWERPC_FLAG_PMM |
+ POWERPC_FLAG_BUS_CLK | POWERPC_FLAG_CFAR |
+ POWERPC_FLAG_VSX;
+pcc->l1_dcache_size = 0x8000;
+pcc->l1_icache_size = 0x8000;
+}
+
 POWERPC_FAMILY(POWER8)(ObjectClass *oc, void *data)
 {
 DeviceClass *dc = DEVICE_CLASS(oc);
-- 
1.8.4.rc4




Re: [Qemu-devel] i386: pc: align gpa<->hpa on 1GB boundary (v4)

2013-11-07 Thread Marcelo Tosatti
On Thu, Nov 07, 2013 at 04:24:59PM +0100, Igor Mammedov wrote:
> On Wed, 6 Nov 2013 19:31:19 -0200
> Marcelo Tosatti  wrote:
> 
> > 
> > v2: condition enablement of new mapping to new machine types (Paolo)
> > v3: fix changelog
> > v4: rebase
> > 
> > -
> > 
> > 
> > Align guest physical address and host physical address
> > beyond guest 4GB on a 1GB boundary.
> > 
> > Otherwise 1GB TLBs cannot be cached for the range.
> > 
> > Signed-off-by: Marcelo Tosatti 
> > 
> > 
> > diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> > index 12c436e..d29196d 100644
> > --- a/hw/i386/pc.c
> > +++ b/hw/i386/pc.c
> > @@ -1156,7 +1156,7 @@ FWCfgState *pc_memory_init(MemoryRegion 
> > *system_memory,
> >  {
> >  int linux_boot, i;
> >  MemoryRegion *ram, *option_rom_mr;
> > -MemoryRegion *ram_below_4g, *ram_above_4g;
> > +MemoryRegion *ram_below_4g, *ram_above_4g, *ram_above_4g_piecetwo;
> >  FWCfgState *fw_cfg;
> >  
> >  linux_boot = (kernel_filename != NULL);
> > @@ -1177,10 +1177,45 @@ FWCfgState *pc_memory_init(MemoryRegion 
> > *system_memory,
> >  e820_add_entry(0, below_4g_mem_size, E820_RAM);
> >  if (above_4g_mem_size > 0) {
> >  ram_above_4g = g_malloc(sizeof(*ram_above_4g));
> > -memory_region_init_alias(ram_above_4g, NULL, "ram-above-4g", ram,
> > - below_4g_mem_size, above_4g_mem_size);
> > -memory_region_add_subregion(system_memory, 0x1ULL,
> > +/*
> > + *
> > + * If 1GB hugepages are used to back guest RAM, map guest address
> > + * space in the range [ramsize,ramsize+holesize] to the ram block
> > + * range [holestart, 4GB]
> > + *
> > + *  0  h 4G 
> > [ramsize,ramsize+holesize]
> > + *
> > + * guest-addr-space [  ] [  ][xxx]
> > + */--/
> > + * contiguous-ram-block [  ][xxx][ ]
> > + *
> > + * So that memory beyond 4GB is aligned on a 1GB boundary,
> > + * at the host physical address space.
> > + *
> > + */
> I did some corner cases testing and there is alias overlapping in case 
>   -m 4096 -mem-path /var/lib/hugetlbfs/global/pagesize-1GB
> 
>   0001-00011fff (prio 0, RW): alias ram-above-4g-piecetwo 
> @pc.ram e000-
>   0001-0001 (prio 0, RW): alias ram-above-4g @pc.ram 
> 0001-0001
> 
> perhaps zero sized ram-above-4g shouldn't be created at all?

Right.

> in addition ram-above-4g-piecetwo starts at half page offset e000 
> but guest sees it 4Gb offset,
> wouldn't it cause the same issue patch tries to solve but only for 
> ram-above-4g-piecetwo tail sync host/guest offsets
> are not 1Gb aligned?

Piece 1 is aligned. Piece 2 maps from tail of RAM (gpa) to start of hole
(ramblock).

> there is more misalignment with:
>  -m 4097 -mem-path /var/lib/hugetlbfs/global/pagesize-1GB
> 
>   0001-0001000f (prio 0, RW): alias ram-above-4g @pc.ram 
> 0001-0001000f

Piece 1 is aligned.

>   00010010-0001200f (prio 0, RW): alias ram-above-4g-piecetwo 
> @pc.ram e000-

Piece 2 is not. Should allocate one extra MB of RAM to align that. I'll
resend, thanks.

> where ram-above-4g-piecetwo is aligned with 1Gb+1Mb GPA offset, in addition 
> to 500Mb offset of HPA.
> which would cause the same issue as above prehaps?



[Qemu-devel] [PATCH] qdev-properties-system.c: Allow vlan or netdev for -device, not both

2013-11-07 Thread Vlad Yasevich
It is currently possible to specify things like:
-device e1000,netdev=foo,vlan=1
With this usage, whichever argument was specified last (vlan or netdev)
overwrites what was previousely set and results in a non-working
configuration.  Even worse, when used with multiqueue devices,
it causes a segmentation fault on exit in qemu_free_net_client.

That patch treates the above command line options as invalid and
generates an error at start-up.

Signed-off-by: Vlad Yasevich 
---
 hw/core/qdev-properties-system.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/hw/core/qdev-properties-system.c b/hw/core/qdev-properties-system.c
index 0eada32..729efa8 100644
--- a/hw/core/qdev-properties-system.c
+++ b/hw/core/qdev-properties-system.c
@@ -205,6 +205,11 @@ static int parse_netdev(DeviceState *dev, const char *str, 
void **ptr)
 goto err;
 }
 
+if (ncs[i]) {
+ret = -EINVAL;
+goto err;
+}
+
 ncs[i] = peers[i];
 ncs[i]->queue_index = i;
 }
@@ -301,6 +306,10 @@ static void set_vlan(Object *obj, Visitor *v, void *opaque,
 *ptr = NULL;
 return;
 }
+if (*ptr) {
+error_set_from_qdev_prop_error(errp, -EINVAL, dev, prop, name);
+return;
+}
 
 hubport = net_hub_port_find(id);
 if (!hubport) {
-- 
1.8.3.1




Re: [Qemu-devel] [migration] questions about removing the old block-migration code

2013-11-07 Thread Zhanghaoyu (A)
>> I read below words on the report of > forecast (May 29, 2013)>, We were going to remove the old 
>> block-migration code Then people fixed it
>> Good: it works now
>> Bad: We have to maintain both
>> It uses the same port than migration
>> You need to migrate all/none of block devices
>> 
>> The old block-migration code said above is that in block-migration.c?
>
>Yes.
>
>> What are the reasons of removing the old block-migration code? Buggy 
>> implementation? Or need to migrate all/none of block devices?
>
>Buggy and tightly coupled with the live migration code, making it hard to 
>modify either area independently.

Thanks a lot for explaining.
Till now, we still use the old block-migration code in our virtualization 
solution.
Could you detail the bugs that the old block-migration code have?

Thanks,
Zhang Haoyu

>
>> What's the substitutional method? drive_mirror?
>
>drive_mirror over NBD is an alternative.  There are security and integration 
>challenges with those approaches but libvirt has added drive-mirror block 
>migration support.
>
>Stefan



Re: [Qemu-devel] [PULL 17/43] loader: support for unmapped ROM blobs

2013-11-07 Thread Alexander Graf

On 14.10.2013, at 16:58, Michael S. Tsirkin  wrote:

> Support ROM blobs not mapped into guest memory:
> same as ROM files really but use caller's buffer.
> 
> Support invoking callback on access and
> return memory pointer making it easier
> for caller to update memory if necessary.
> 
> Reviewed-by: Gerd Hoffmann 
> Tested-by: Gerd Hoffmann 
> Reviewed-by: Laszlo Ersek 
> Reviewed-by: Igor Mammedov 
> Tested-by: Igor Mammedov 
> Signed-off-by: Michael S. Tsirkin 

This patch breaks -M bamboo. Let explain inline why.

> ---
> hw/lm32/lm32_hwsetup.h |  2 +-
> include/hw/loader.h|  7 ---
> hw/core/loader.c   | 23 ---
> 3 files changed, 25 insertions(+), 7 deletions(-)
> 
> diff --git a/hw/lm32/lm32_hwsetup.h b/hw/lm32/lm32_hwsetup.h
> index 3449bd8..9fd5e69 100644
> --- a/hw/lm32/lm32_hwsetup.h
> +++ b/hw/lm32/lm32_hwsetup.h
> @@ -73,7 +73,7 @@ static inline void hwsetup_free(HWSetup *hw)
> static inline void hwsetup_create_rom(HWSetup *hw,
> hwaddr base)
> {
> -rom_add_blob("hwsetup", hw->data, TARGET_PAGE_SIZE, base);
> +rom_add_blob("hwsetup", hw->data, TARGET_PAGE_SIZE, base, NULL, NULL, 
> NULL);
> }
> 
> static inline void hwsetup_add_u8(HWSetup *hw, uint8_t u)
> diff --git a/include/hw/loader.h b/include/hw/loader.h
> index 6145736..e0c576b 100644
> --- a/include/hw/loader.h
> +++ b/include/hw/loader.h
> @@ -40,8 +40,9 @@ extern bool rom_file_in_ram;
> 
> int rom_add_file(const char *file, const char *fw_dir,
>  hwaddr addr, int32_t bootindex);
> -int rom_add_blob(const char *name, const void *blob, size_t len,
> - hwaddr addr);
> +void *rom_add_blob(const char *name, const void *blob, size_t len,
> +   hwaddr addr, const char *fw_file_name,
> +   FWCfgReadCallback fw_callback, void *callback_opaque);
> int rom_add_elf_program(const char *name, void *data, size_t datasize,
> size_t romsize, hwaddr addr);
> int rom_load_all(void);
> @@ -53,7 +54,7 @@ void do_info_roms(Monitor *mon, const QDict *qdict);
> #define rom_add_file_fixed(_f, _a, _i)  \
> rom_add_file(_f, NULL, _a, _i)
> #define rom_add_blob_fixed(_f, _b, _l, _a)  \
> -rom_add_blob(_f, _b, _l, _a)
> +(rom_add_blob(_f, _b, _l, _a, NULL, NULL, NULL) ? 0 : -1)

^

> 
> #define PC_ROM_MIN_VGA 0xc
> #define PC_ROM_MIN_OPTION  0xc8000
> diff --git a/hw/core/loader.c b/hw/core/loader.c
> index 7b3d3ee..449bd4c 100644
> --- a/hw/core/loader.c
> +++ b/hw/core/loader.c
> @@ -700,10 +700,12 @@ err:
> return -1;
> }
> 
> -int rom_add_blob(const char *name, const void *blob, size_t len,
> - hwaddr addr)
> +void *rom_add_blob(const char *name, const void *blob, size_t len,
> +   hwaddr addr, const char *fw_file_name,
> +   FWCfgReadCallback fw_callback, void *callback_opaque)
> {
> Rom *rom;
> +void *data = NULL;

data = NULL

> 
> rom   = g_malloc0(sizeof(*rom));
> rom->name = g_strdup(name);
> @@ -713,7 +715,22 @@ int rom_add_blob(const char *name, const void *blob, 
> size_t len,
> rom->data = g_malloc0(rom->datasize);
> memcpy(rom->data, blob, len);
> rom_insert(rom);
> -return 0;
> +if (fw_file_name && fw_cfg) {

We never execute this for rom_add_blob_fixed(), so ...

> +char devpath[100];
> +
> +snprintf(devpath, sizeof(devpath), "/rom@%s", fw_file_name);
> +
> +if (rom_file_in_ram) {
> +data = rom_set_mr(rom, OBJECT(fw_cfg), devpath);
> +} else {
> +data = rom->data;
> +}
> +
> +fw_cfg_add_file_callback(fw_cfg, fw_file_name,
> + fw_callback, callback_opaque,
> + data, rom->romsize);
> +}
> +return data;

... we always return NULL here.

That means that rom_add_blob_fixed() always returns -1. If rom_add_blob_fixed() 
returns != 0, we abort loading the device tree as blob during bamboo machine 
init.


Alex




Re: [Qemu-devel] [PATCH 00/14] VSX Stage 4

2013-11-07 Thread Richard Henderson
On 11/07/2013 06:31 AM, Tom Musta wrote:
> The single-precision scalar arithmetic instructions all interpret the most
> significant 64 bits of a VSR as a single precision floating point number
> stored in double precision format (similar to the standard PowerPC floating
> point single precision instructions).  Thus a common theme in the supporting
> code is rounding of an intermediate double-precision number to single 
> precision.

Modulo my comments wrt the actual computation of fma, the patches all look fine.

But I'd like to also mention a pre-existing flaw/niggle in the ppc port.

The conversions to/from in-register representation for the single-precision
values should never raise exceptions.  Yet we always use

d.d = float32_to_float64(f.f, &env->fp_status);
f.f = float64_to_float32(d.d, &env->fp_status);

The use of env->fp_status is either wrong or extremely misleading.  It sure
looks like the operation affects global cpu state.  It may be that that state
is never copied back to the "real" fpscr and so doesn't actually affect cpu
state, but how can I see that for sure?

I think it would be better to implement ConvertSPtoDP_NP and ConvertSP64toSP
exactly as written in the spec.

Or at minimum use a dummy fp_status that's not associated with env.  It should
not matter what the "real" rounding mode is in either case, since values that
are not exactly representable as single-precision values give undefined results.


r~



Re: [Qemu-devel] [PATCH 12/14] VSX Stage 4: Add Scalar SP Fused Multiply-Adds

2013-11-07 Thread Richard Henderson
On 11/08/2013 09:30 AM, Richard Henderson wrote:
> On 11/08/2013 09:28 AM, Richard Henderson wrote:
>> On 11/07/2013 06:31 AM, Tom Musta wrote:
>>>  }  
>>>\
>>> +   
>>>\
>>> +if (r2sp) {
>>>\
>>> +float32 tmp32 = float64_to_float32(xt_out.fld[i],  
>>>\
>>> +   &env->fp_status);   
>>>\
>>> +xt_out.fld[i] = float32_to_float64(tmp32, &env->fp_status);
>>>\
>>> +}  
>>>\
>>> +   
>>>\
>>
>> You can't get correct results for a single-precision fma from a
>> double-precision fma and merely rounding the results.
>>
>> See e.g. glibc's sysdeps/ieee754/dbl-64/s_fmaf.c.
> 
> Blah, nevermind.  That would be using separate add+mul in double-precision, 
> not
> using a double-precision fma primitive.

Hmph.  I was right the first time.  See

> http://www.exploringbinary.com/double-rounding-errors-in-floating-point-conversions/

for example inputs that suffer from double-rounding.

What's needed in each of the examples are infinite precision values containing
55 bits.  This is easy to accomplish with fma.

Two 23-bit inputs can create a product with 46 significant bits.  One can
append 23 more significant bits by choosing an exponent for the addend that
does not overlap the product.  Thus one can create (almost) every intermediate
result with up to 69 consecutive bits (the exception being products without
factors that can be represented in 23-bits).

I'm too lazy to decompose the examples therein to actual single-precision
inputs, but I'm certain it can be done.

Thus you *do* need the round-to-zero plus inexact solution that glibc uses.
(Or to perform the fma in 128-bits and round once, but I think that would be
way more intrusive wrt the rest of the code, and more expensive than necessary.)


r~



Re: [Qemu-devel] [PATCH 12/14] VSX Stage 4: Add Scalar SP Fused Multiply-Adds

2013-11-07 Thread Richard Henderson
On 11/08/2013 09:28 AM, Richard Henderson wrote:
> On 11/07/2013 06:31 AM, Tom Musta wrote:
>>  }   
>>   \
>> +
>>   \
>> +if (r2sp) { 
>>   \
>> +float32 tmp32 = float64_to_float32(xt_out.fld[i],   
>>   \
>> +   &env->fp_status);
>>   \
>> +xt_out.fld[i] = float32_to_float64(tmp32, &env->fp_status); 
>>   \
>> +}   
>>   \
>> +
>>   \
> 
> You can't get correct results for a single-precision fma from a
> double-precision fma and merely rounding the results.
> 
> See e.g. glibc's sysdeps/ieee754/dbl-64/s_fmaf.c.

Blah, nevermind.  That would be using separate add+mul in double-precision, not
using a double-precision fma primitive.


r~




Re: [Qemu-devel] [PATCH 12/14] VSX Stage 4: Add Scalar SP Fused Multiply-Adds

2013-11-07 Thread Richard Henderson
On 11/07/2013 06:31 AM, Tom Musta wrote:
>  }
>  \
> + 
>  \
> +if (r2sp) {  
>  \
> +float32 tmp32 = float64_to_float32(xt_out.fld[i],
>  \
> +   &env->fp_status); 
>  \
> +xt_out.fld[i] = float32_to_float64(tmp32, &env->fp_status);  
>  \
> +}
>  \
> + 
>  \

You can't get correct results for a single-precision fma from a
double-precision fma and merely rounding the results.

See e.g. glibc's sysdeps/ieee754/dbl-64/s_fmaf.c.


r~



Re: [Qemu-devel] [PATCH v2] e1000: initial link negotiation on mac osx

2013-11-07 Thread Alexander Graf


Am 07.11.2013 um 21:28 schrieb "Gabriel L. Somlo" :

> Some guest operating systems' drivers (particularly Mac OS X)
> expect the link state to be pre-initialized by an earlier
> component such as a proprietary BIOS. This patch injects
> additional LSC events upon PHY reset, allowing the OS X driver
> to successfully complete initial link negotiation. This is a
> follow-up to commit 372254c6e5c078fb13b236bb648d2b9b2b0c70f1,
> which works around the OS X driver's failure to properly set
> up the MAC address.
> 
> Signed-off-by: Gabriel Somlo 
> ---
> 
> On Thu, Nov 07, 2013 at 08:28:47PM +0100, Paolo Bonzini wrote:
>> Is there any way to work around this in the guest?  Such as using a
>> UEFI driver for e1000 or something like that.
> 
> Currently OS X boots on top of SeaBIOS and Chameleon, neither of which
> know anything about the e1000 hardware. On real hardware, the XNU e1000
> driver expects the proprietary BIOS to set things up "just right", and
> doesn't have to bother jumping through all the hoops to properly
> initialize the hardware from scratch (after all, the XNU driver
> developers only have to care about a limited range of carefully
> controlled hardware).
> 
> In the VM/guest scenario, QEMU is the only piece that has any knowledge
> of the e1000 hardware, so having it prep things for the guest would be
> the path of least resistance. Using a completely different alternative
> to SeaBIOS (one that would/could assume e1000 is present and would know
> enough about it to configure it just right) sounds a lot less feasible.

I'm not sure I agree. We can easily modify SeaBIOS to just loop through all PCI 
devices, look for an e1000 and initialize it far enough for XNU, no?

After all, it sounds like that's closer to the way a real Mac works.

Alex

> 
> Oh, and it gets worse :) My v1 patch does enough cheating to get OS X
> to link up successfully on a e1000-equipped *Q35* VM. If Q35 is not
> specified, then another bit of trickery is needed (i.e. inject another
> LSC when the driver initially unmasks LSC in the IMS register...
> 
> Anyhow, this second version seems to work both with Q35 and without,
> on my vanilla SnowLeopard image.
> 
> Any more thoughts and ideas much appreciated!
> 
> Thanks,
>  Gabriel
> 
> hw/net/e1000.c | 20 +++-
> 1 file changed, 19 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/net/e1000.c b/hw/net/e1000.c
> index ae63591..fe0f34e 100644
> --- a/hw/net/e1000.c
> +++ b/hw/net/e1000.c
> @@ -186,6 +186,9 @@ e1000_link_up(E1000State *s)
> s->phy_reg[PHY_STATUS] |= MII_SR_LINK_STATUS;
> }
> 
> +/* Forward decl. for use in set_phy_ctrl() (OS X link nego. workaround) */
> +static void set_ics(E1000State *s, int index, uint32_t val);
> +
> static void
> set_phy_ctrl(E1000State *s, int index, uint16_t val)
> {
> @@ -197,6 +200,15 @@ set_phy_ctrl(E1000State *s, int index, uint16_t val)
> if (!(s->compat_flags & E1000_FLAG_AUTONEG)) {
> return;
> }
> +/*
> + * The Mac OS X driver expects a pre-initialized network card; injecting
> + * an extra LSC event here allows initial link negotiation to succeed in
> + * the absence of the Apple EFI BIOS.
> + */
> +if ((val & MII_CR_RESET)) {
> +set_ics(s, 0, E1000_ICR_LSC);
> +return;
> +}
> if ((val & MII_CR_AUTO_NEG_EN) && (val & MII_CR_RESTART_AUTO_NEG)) {
> e1000_link_down(s);
> s->phy_reg[PHY_STATUS] &= ~MII_SR_AUTONEG_COMPLETE;
> @@ -1159,8 +1171,14 @@ set_imc(E1000State *s, int index, uint32_t val)
> static void
> set_ims(E1000State *s, int index, uint32_t val)
> {
> +uint32_t ics_val = 0;
> +
> +/* When Mac OS X initially unmasks LSC, it expects to see it set in ICS 
> */
> +if (s->mac_reg[IMS] == 0 && (val & E1000_IMS_LSC))
> +ics_val |= E1000_ICR_LSC;
> +
> s->mac_reg[IMS] |= val;
> -set_ics(s, 0, 0);
> +set_ics(s, 0, ics_val);
> }
> 
> #define getreg(x)[x] = mac_readreg
> -- 
> 1.8.1.4
> 



[Qemu-devel] [PATCH 2/2] i386/pc_piix: the pci-hole should end where the system flash starts

2013-11-07 Thread Laszlo Ersek
Signed-off-by: Laszlo Ersek 
---
 hw/i386/pc_piix.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 6e2c027..9bda20a 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -152,7 +152,7 @@ static void pc_init1(QEMUMachineInitArgs *args,
 pci_bus = i440fx_init(&i440fx_state, &piix3_devfn, &isa_bus, gsi,
   system_memory, system_io, args->ram_size,
   below_4g_mem_size,
-  0x1ULL - below_4g_mem_size,
+  (0x1ULL - flash_size) - 
below_4g_mem_size,
   above_4g_mem_size,
   pci_memory, ram_memory);
 } else {
-- 
1.8.3.1




[Qemu-devel] [PATCH 0/2] Re: exec: fix regression by making system-memory region UINT64_MAX size

2013-11-07 Thread Laszlo Ersek
On 11/07/13 22:24, Marcel Apfelbaum wrote:

> Why pci-hole and system.flash collide? IMHO we should not play with
> priorities here, better solve the collision.

What about this "beautiful" series? It produces

memory
-000f (prio 0, RW): system
  [...]
  6000-ffdf (prio 0, RW): alias pci-hole @pci
  6000-ffdf
  [...]
  ffe0- (prio 0, R-): system.flash

and I can run OVMF with it. It also stays within i386/pc.

Re 2/2, note that "below_4g_mem_size" never exceeds 0xe000 in
pc_init1(), so the subtraction is safe.

Thanks
Laszlo

Laszlo Ersek (2):
  i386/pc: propagate flash size from pc_system_flash_init() to
pc_init1()
  i386/pc_piix: the pci-hole should end where the system flash starts

 include/hw/i386/pc.h |  6 --
 hw/i386/pc.c |  5 +++--
 hw/i386/pc_piix.c|  5 +++--
 hw/i386/pc_q35.c |  3 ++-
 hw/i386/pc_sysfw.c   | 10 +++---
 5 files changed, 19 insertions(+), 10 deletions(-)

-- 
1.8.3.1




[Qemu-devel] [PATCH 1/2] i386/pc: propagate flash size from pc_system_flash_init() to pc_init1()

2013-11-07 Thread Laszlo Ersek
... upwards through the following call chain:

  pc_init1() | pc_q35_init()
pc_memory_init()
  pc_system_firmware_init()
pc_system_flash_init()

Signed-off-by: Laszlo Ersek 
---
 include/hw/i386/pc.h |  6 --
 hw/i386/pc.c |  5 +++--
 hw/i386/pc_piix.c|  3 ++-
 hw/i386/pc_q35.c |  3 ++-
 hw/i386/pc_sysfw.c   | 10 +++---
 5 files changed, 18 insertions(+), 9 deletions(-)

diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 03cc0ba..a9b938e 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -148,7 +148,8 @@ FWCfgState *pc_memory_init(MemoryRegion *system_memory,
ram_addr_t above_4g_mem_size,
MemoryRegion *rom_memory,
MemoryRegion **ram_memory,
-   PcGuestInfo *guest_info);
+   PcGuestInfo *guest_info,
+   int64_t **flash_size);
 qemu_irq *pc_allocate_cpu_irq(void);
 DeviceState *pc_vga_init(ISABus *isa_bus, PCIBus *pci_bus);
 void pc_basic_device_init(ISABus *isa_bus, qemu_irq *gsi,
@@ -232,7 +233,8 @@ static inline bool isa_ne2000_init(ISABus *bus, int base, 
int irq, NICInfo *nd)
 
 /* pc_sysfw.c */
 void pc_system_firmware_init(MemoryRegion *rom_memory,
- bool isapc_ram_fw);
+ bool isapc_ram_fw,
+ int64_t **flash_size);
 
 /* pvpanic.c */
 void pvpanic_init(ISABus *bus);
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 12c436e..3ec18aa 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1152,7 +1152,8 @@ FWCfgState *pc_memory_init(MemoryRegion *system_memory,
ram_addr_t above_4g_mem_size,
MemoryRegion *rom_memory,
MemoryRegion **ram_memory,
-   PcGuestInfo *guest_info)
+   PcGuestInfo *guest_info,
+   int64_t **flash_size)
 {
 int linux_boot, i;
 MemoryRegion *ram, *option_rom_mr;
@@ -1186,7 +1187,7 @@ FWCfgState *pc_memory_init(MemoryRegion *system_memory,
 
 
 /* Initialize PC system firmware */
-pc_system_firmware_init(rom_memory, guest_info->isapc_ram_fw);
+pc_system_firmware_init(rom_memory, guest_info->isapc_ram_fw, flash_size);
 
 option_rom_mr = g_malloc(sizeof(*option_rom_mr));
 memory_region_init_ram(option_rom_mr, NULL, "pc.rom", PC_ROM_SIZE);
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 4fdb7b6..6e2c027 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -89,6 +89,7 @@ static void pc_init1(QEMUMachineInitArgs *args,
 DeviceState *icc_bridge;
 FWCfgState *fw_cfg = NULL;
 PcGuestInfo *guest_info;
+int64_t flash_size = 0;
 
 if (xen_enabled() && xen_hvm_init(&ram_memory) != 0) {
 fprintf(stderr, "xen hardware virtual machine initialisation 
failed\n");
@@ -135,7 +136,7 @@ static void pc_init1(QEMUMachineInitArgs *args,
args->kernel_filename, args->kernel_cmdline,
args->initrd_filename,
below_4g_mem_size, above_4g_mem_size,
-   rom_memory, &ram_memory, guest_info);
+   rom_memory, &ram_memory, guest_info, &flash_size);
 }
 
 gsi_state = g_malloc0(sizeof(*gsi_state));
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index 4c191d3..90f29e9 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -76,6 +76,7 @@ static void pc_q35_init(QEMUMachineInitArgs *args)
 PCIDevice *ahci;
 DeviceState *icc_bridge;
 PcGuestInfo *guest_info;
+int64_t flash_size = 0;
 
 if (xen_enabled() && xen_hvm_init(&ram_memory) != 0) {
 fprintf(stderr, "xen hardware virtual machine initialisation 
failed\n");
@@ -120,7 +121,7 @@ static void pc_q35_init(QEMUMachineInitArgs *args)
args->kernel_filename, args->kernel_cmdline,
args->initrd_filename,
below_4g_mem_size, above_4g_mem_size,
-   rom_memory, &ram_memory, guest_info);
+   rom_memory, &ram_memory, guest_info, &flash_size);
 }
 
 /* irq lines */
diff --git a/hw/i386/pc_sysfw.c b/hw/i386/pc_sysfw.c
index e917c83..0d05dec 100644
--- a/hw/i386/pc_sysfw.c
+++ b/hw/i386/pc_sysfw.c
@@ -73,7 +73,8 @@ static void pc_isa_bios_init(MemoryRegion *rom_memory,
 }
 
 static void pc_system_flash_init(MemoryRegion *rom_memory,
- DriveInfo *pflash_drv)
+ DriveInfo *pflash_drv,
+ int64_t **flash_size)
 {
 BlockDriverState *bdrv;
 int64_t size;
@@ -101,6 +102,7 @@ static void pc_system_flash_init(MemoryRegion *rom_memory,
 flash_mem = pflash_cfi01_get_memory(system_flash);
 
 pc_isa_bios_init(rom_memory, flash_mem, size);
+*flash_size = size;
 }
 
 static void old_pc_system_rom_init(MemoryRegion 

Re: [Qemu-devel] [PATCH] exec: fix regression by making system-memory region UINT64_MAX size

2013-11-07 Thread Marcel Apfelbaum
On Thu, 2013-11-07 at 22:31 +0100, Paolo Bonzini wrote:
> Il 07/11/2013 22:24, Marcel Apfelbaum ha scritto:
> > Thank you Laszlo for the detailed info!
> > I think the problem is right above. Why pci-hole and system.flash collide?
> > IMHO we should not play with priorities here, better solve the collision.
> 
> We need to audit all the other boards that support PCI...  I'll take a
> look tomorrow since you guys are off.
Thanks Paolo,
Let me just point out what I know (or I think I know):
1. Not all architectures have the behavior: "Address space that is not RAM(and 
friends) is for sure PCI".
   Only x86 behaves like this (I think). That means that you cannot have a 
64bit wide pci-hole
   with lower priority that catches all accesses that are not for RAM(and 
firends).
2. If the above is right, and making pci-hole 64 bit wide is not an option,
   playing with pci-holes/other-region priorities it would be just wrong,
   it would be only to "fight" with the locality of the memory region's 
priority.

That being said, I am looking forward for your findings.
Thanks!
Marcel
  
> 
> Paolo






Re: [Qemu-devel] [PATCH] exec: fix regression by making system-memory region UINT64_MAX size

2013-11-07 Thread Marcel Apfelbaum
On Thu, 2013-11-07 at 22:48 +0100, Laszlo Ersek wrote:
> On 11/07/13 22:24, Marcel Apfelbaum wrote:
> > On Thu, 2013-11-07 at 22:12 +0100, Laszlo Ersek wrote:
> 
> >>   adding subregion 'pci-hole' to region 'system' at offset 6000
> >>   warning: subregion collision 6000/a000 (pci-hole) vs 
> >> ffe0/20 (system.flash)
> > Thank you Laszlo for the detailed info!
> > I think the problem is right above. Why pci-hole and system.flash collide?
> > IMHO we should not play with priorities here, better solve the collision.
> 
> pc_init1()
>   pc_memory_init()
> pc_system_firmware_init()
>   pc_system_flash_init() < sets base address to
>0x1ULL - flash_size
> pflash_cfi01_register()
>   sysbus_mmio_map()
> sysbus_mmio_map_common()
>   memory_region_add_subregion()
>   i440fx_init()
> memory_region_init_alias("pci-hole")
> 
> pc_init1() passes
> 
> 0x1ULL - below_4g_mem_size
> 
> to i440fx_init() as "pci_hole_size", which is then used as the size of
> the "pci-hole" alias.
> 
> We should probably subtract the size of the flash from this, but I don't
> know how to do that "elegantly". Yet another (output) parameter for
> pc_memory_init()? Blech.
> 
> Or look up the end address of "system.flash" by name?
Both seems ugly to me...
As Peter Maydell pointed out, the pci-hole belongs to the pc model,
so we can actually change its priority without affecting other
architectures.
 
> 
> Thanks
> Laszlo
> 






Re: [Qemu-devel] [PATCH] exec: fix regression by making system-memory region UINT64_MAX size

2013-11-07 Thread Marcel Apfelbaum
On Thu, 2013-11-07 at 21:51 +, Peter Maydell wrote:
> On 7 November 2013 21:38, Marcel Apfelbaum  wrote:
> > Thanks Paolo,
> > Let me just point out what I know (or I think I know):
> > 1. Not all architectures have the behavior: "Address space that is not 
> > RAM(and friends) is for sure PCI".
> >Only x86 behaves like this (I think).
> 
> More specifically, the x86 pc behaves like this. Other
> x86 based systems could in theory behave differently
> (not that we actually model any, I think).
> 
> > That means that you cannot have a 64bit wide pci-hole
> >with lower priority that catches all accesses that are not for RAM(and 
> > firends).
> 
> ...but this conclusion is wrong, because the pci-hole
> region is created by the pc model. So we should create
Yes... It think you are right. I was thinking for some
reason of master-abort region belonging to the the pci bus
that is not specific to x86 pc...

> it at the correct priority and size to give the behaviour
> relative to other devices in the pc model that is required
> (ie that the hardware has). This doesn't affect any other
> target architecture or board.
> 
> That said, I don't know enough about the PC to know what
> the exact details of the pci-hole are, so I'm not making a
> statement about what the correct model is. I'm just saying
> that what you do with the pci-hole and the container it lives
> in and the other devices in that container is not going to
> change the behaviour of any other target board.
> 
> > 2. If the above is right, and making pci-hole 64 bit wide is not an option,
> >playing with pci-holes/other-region priorities it would be just wrong,
> >it would be only to "fight" with the locality of the memory region's 
> > priority.
> 
> I have no idea what you mean by "fighting" here. MR priorities
By "fighting" I was referring exactly to the fact that because 
priorities are container specific we need to use them twice to
get the wanted behavior (master abort being with the lowest priority)
1. master-abort region priority for pci-address-space
2. pci-hole priority for system-address-space
But this is as designed...

Thanks,
Marcel

> apply only within a specific container region[*], to set which of
> that container's children appears 'above' another. They're
> totally local to the container (which in this case is part of the
> PC model, not the generic PCI code) and so the PC model
> can freely set them to whatever makes most sense.
> 
> [*] if you didn't already know this, see the recently committed
> updates to doc/memory.txt for a more detailed explanation.
> 
> -- PMM
> 






Re: [Qemu-devel] [PATCH] exec: fix regression by making system-memory region UINT64_MAX size

2013-11-07 Thread Peter Maydell
On 7 November 2013 21:38, Marcel Apfelbaum  wrote:
> Thanks Paolo,
> Let me just point out what I know (or I think I know):
> 1. Not all architectures have the behavior: "Address space that is not 
> RAM(and friends) is for sure PCI".
>Only x86 behaves like this (I think).

More specifically, the x86 pc behaves like this. Other
x86 based systems could in theory behave differently
(not that we actually model any, I think).

> That means that you cannot have a 64bit wide pci-hole
>with lower priority that catches all accesses that are not for RAM(and 
> firends).

...but this conclusion is wrong, because the pci-hole
region is created by the pc model. So we should create
it at the correct priority and size to give the behaviour
relative to other devices in the pc model that is required
(ie that the hardware has). This doesn't affect any other
target architecture or board.

That said, I don't know enough about the PC to know what
the exact details of the pci-hole are, so I'm not making a
statement about what the correct model is. I'm just saying
that what you do with the pci-hole and the container it lives
in and the other devices in that container is not going to
change the behaviour of any other target board.

> 2. If the above is right, and making pci-hole 64 bit wide is not an option,
>playing with pci-holes/other-region priorities it would be just wrong,
>it would be only to "fight" with the locality of the memory region's 
> priority.

I have no idea what you mean by "fighting" here. MR priorities
apply only within a specific container region[*], to set which of
that container's children appears 'above' another. They're
totally local to the container (which in this case is part of the
PC model, not the generic PCI code) and so the PC model
can freely set them to whatever makes most sense.

[*] if you didn't already know this, see the recently committed
updates to doc/memory.txt for a more detailed explanation.

-- PMM



Re: [Qemu-devel] [PATCH] exec: fix regression by making system-memory region UINT64_MAX size

2013-11-07 Thread Laszlo Ersek
On 11/07/13 22:24, Marcel Apfelbaum wrote:
> On Thu, 2013-11-07 at 22:12 +0100, Laszlo Ersek wrote:

>>   adding subregion 'pci-hole' to region 'system' at offset 6000
>>   warning: subregion collision 6000/a000 (pci-hole) vs 
>> ffe0/20 (system.flash)
> Thank you Laszlo for the detailed info!
> I think the problem is right above. Why pci-hole and system.flash collide?
> IMHO we should not play with priorities here, better solve the collision.

pc_init1()
  pc_memory_init()
pc_system_firmware_init()
  pc_system_flash_init() < sets base address to
   0x1ULL - flash_size
pflash_cfi01_register()
  sysbus_mmio_map()
sysbus_mmio_map_common()
  memory_region_add_subregion()
  i440fx_init()
memory_region_init_alias("pci-hole")

pc_init1() passes

0x1ULL - below_4g_mem_size

to i440fx_init() as "pci_hole_size", which is then used as the size of
the "pci-hole" alias.

We should probably subtract the size of the flash from this, but I don't
know how to do that "elegantly". Yet another (output) parameter for
pc_memory_init()? Blech.

Or look up the end address of "system.flash" by name?

Thanks
Laszlo



[Qemu-devel] [Bug 886408] Re: Windows 64 bits install BSOD : UNSUPPORTED_PROCESSOR (only without kvm)

2013-11-07 Thread Stefan Weil
*** This bug is a duplicate of bug 921208 ***
https://bugs.launchpad.net/bugs/921208

** This bug has been marked a duplicate of bug 921208
   win7/x64 installer hangs on startup with 0x005d.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/886408

Title:
  Windows 64 bits install BSOD : UNSUPPORTED_PROCESSOR (only without
  kvm)

Status in QEMU:
  New

Bug description:
  Windows DVD starts loading and then stops with error 0x5d
  MSDN tells us that 0x5d means unsupported_processor

  The guest CPU is 64 bits though, the DVD fails early if booted on a
  32bits CPU

  Tested versions: 0.14.1, git (932eacc158)

  qemu-system-x86_64 -m 1024 -boot d -cdrom win7.iso -hda win7.hdd

  Tried various -cpu options: Nehalem, core2duo, Penryn, Conroe,
  Opteron_G3

  x86_64 host, without vx-t

  Same iso is reported to install fine with kvm

  Other similar reports:
  http://virtuallyfun.superglobalmegacorp.com/?p=386
  http://comments.gmane.org/gmane.comp.emulators.qemu/92457
  
http://xen.1045712.n5.nabble.com/64bit-Windows-Guest-Blue-Screens-on-install-td4505715.html

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/886408/+subscriptions



Re: [Qemu-devel] [RFC PATCH v1: 11/12] mc: register MC qemu-file functions and expose MC tunable capability

2013-11-07 Thread Isaku Yamahata
On Wed, Nov 06, 2013 at 11:34:25AM -0500,
"Michael R. Hines"  wrote:

> On 10/23/2013 07:00 AM, Isaku Yamahata wrote:
> >Since more integer parameters would come in the future, so how about
> >set_migrate_parameter similar to set_migrate_capability?
> >It sets integer value, while set_migrate_capability sets bool value.
>
> That's a very good idea, I think - but we might get some pushback
> from the list.

Maybe. Actually we already have three configurable parameters.
bandwith_limit, xbzrle_cache_size, max_downtime.
They can be all consolidated.


> There is a hesitation to add such low-level parameters, but for
> something like
> micro-checkpointing that may potentially have large impacts on
> application performance,
> I do think it would be critical to expose more customizability
> like 'set_migrate_parameter' to management software.
> 
> In fact, this separation of function may be "better" than continuing
> to expand
> the list of individual commands in the QEMU monitor.
> 
> Could you submit a patch for the framework of such a command?

Sure.
-- 
Isaku Yamahata 



[Qemu-devel] [Bug 1248959] [NEW] pdpe1gb flag is missing in guest running on Intel h/w

2013-11-07 Thread Vladimir Zinovjevs
Public bug reported:

I need to utilize 1G hugepages on my guest system. But this is not
possible as long as there is no pdpe1gb support in guest system.  The
latest source code contains pdpe1gb  support for AMD but not for Intel.

Are there any obstacles that does not allow to implement it for modern
Intel chips?

My configuration:
Host:
---
uname -a
Linux tripel.salab.cic.nsn-rdnet.net 2.6.32-358.14.1.el6.x86_64 #1 SMP Tue Jul 
16 23:51:20 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

cat /etc/*-release
CentOS release 6.4 (Final)

yum list installed | grep qemu
gpxe-roms-qemu.noarch0.9.7-6.9.el6 @base
qemu-img.x86_64  2:0.12.1.2-2.355.0.1.el6.centos.5
qemu-kvm.x86_64  2:0.12.1.2-2.355.0.1.el6.centos.5

cat /proc/cpuinfo
processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model   : 45
model name  : Intel(R) Xeon(R) CPU E5-2680 0 @ 2.70GHz
stepping: 7
cpu MHz : 2700.000
cache size  : 20480 KB
physical id : 0
siblings: 16
core id : 0
cpu cores   : 8
apicid  : 0
initial apicid  : 0
fpu : yes
fpu_exception   : yes
cpuid level : 13
wp  : yes
flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov 
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb 
rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc 
aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr 
pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx 
lahf_lm ida arat epb xsaveopt pln pts dts tpr_shadow vnmi flexpriority ept vpid
bogomips: 5387.09
clflush size: 64
cache_alignment : 64
address sizes   : 46 bits physical, 48 bits virtual

/usr/libexec/qemu-kvm -cpu ?
Recognized CPUID flags:
  f_edx: pbe ia64 tm ht ss sse2 sse fxsr mmx acpi ds clflush pn pse36 pat cmov 
mca pge mtrr sep apic cx8 mce pae msr tsc pse de vme fpu
  f_ecx: hypervisor rdrand f16c avx osxsave xsave aes tsc-deadline popcnt movbe 
x2apic sse4.2|sse4_2 sse4.1|sse4_1 dca pcid pdcm xtpr cx16 fma cid ssse3 tm2 
est smx vmx ds_cpl monitor dtes64 pclmulqdq|pclmuldq pni|sse3
  extf_edx: 3dnow 3dnowext lm|i64 rdtscp pdpe1gb fxsr_opt|ffxsr fxsr mmx mmxext 
nx|xd pse36 pat cmov mca pge mtrr syscall apic cx8 mce pae msr tsc pse de vme 
fpu
  extf_ecx: perfctr_nb perfctr_core topoext tbm nodeid_msr tce fma4 lwp wdt 
skinit xop ibs osvw 3dnowprefetch misalignsse sse4a abm cr8legacy extapic svm 
cmp_legacy lahf_lm

ps ax | grep qemu
 7197 ?Sl 0:15 /usr/libexec/qemu-kvm -name vladimir.AS-0 -S -M 
rhel6.4.0 -cpu 
SandyBridge,+pdpe1gb,+osxsave,+dca,+pcid,+pdcm,+xtpr,+tm2,+est,+smx,+vmx,+ds_cpl,+monitor,+dtes64,+pbe,+tm,+ht,+ss,+acpi,+ds,+vme
 -enable-kvm -m 8192 -mem-prealloc -mem-path 
/var/lib/hugetlbfs/pagesize-1GB/libvirt/qemu -smp 4,sockets=4,cores=1,threads=1 
-uuid ec2d3c58-a7f0-fdbd-9de5-b547a5b3130f -nographic -nodefconfig -nodefaults 
-chardev 
socket,id=charmonitor,path=/var/lib/libvirt/qemu/vladimir.AS-0.monitor,server,nowait
 -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown 
-device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -netdev 
tap,fd=28,id=hostnet0 -device 
e1000,netdev=hostnet0,id=net0,mac=52:54:00:81:5b:df,bus=pci.0,addr=0x3,bootindex=1
 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 
-device pci-assign,host=02:00.0,id=hostdev0,configfd=29,bus=pci.0,addr=0x4 
-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5

Guest:
-
# uname -a
Linux AS-0 2.6.34.13-WR4.3.fp_x86_64_standard-00019-g052bb3e #1 SMP Wed May 8 
12:21:02 EEST 2013 x86_64 x86_64 x86_64 GNU/Linux

#  cat /etc/*-release
Wind River Linux 4.3 glibc_cgl

# cat /proc/cpuinfo
processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model   : 42
model name  : Intel Xeon E312xx (Sandy Bridge)
stepping: 1
cpu MHz : 2693.893
cache size  : 4096 KB
fpu : yes
fpu_exception   : yes
cpuid level : 13
wp  : yes
flags   : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat 
pse36 clflush mmx fxsr sse sse2 ss syscall nx lm constant_tsc rep_good pni 
pclmulqdq ssse3 cx16 sse4_1 sse4_2 x2apic popcnt aes xsave avx hypervisor 
lahf_lm xsaveopt
bogomips: 5387.78
clflush size: 64
cache_alignment : 64
address sizes   : 46 bits physical, 48 bits virtual

** Affects: qemu
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1248959

Title:
  pdpe1gb flag is missing in guest running on Intel h/w

Status in QEMU:
  New

Bug description:
  I need to utilize 1G hugepages on my guest system. But this is not
  possible as long as there is no pdpe1gb support in guest system.  The
  latest source code contains pdpe1gb  supp

[Qemu-devel] [Bug 1248427] Re: whether qemu-img convert will support "-s" option in the newer version

2013-11-07 Thread Yang Yu
Looks like we need to adjust OpenStack code for the change in RHEL6.5. I
see there is a bug reported here
https://bugs.launchpad.net/nova/+bug/1242819 to clear up some
unnecessary steps for snapshot.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1248427

Title:
  whether qemu-img convert will support "-s" option in the newer version

Status in QEMU:
  Incomplete

Bug description:
  In ubuntu os, my qemu-img version is 0.12.0 and its help message show that 
"qemu-img convert" command supports "-s" option, but it disappeared in qemu-img 
0.12.1. However, though it disappeared, it does support in RHEL6.4, and the rpm 
version is qemu-img-0.12.1.2-2.355.el6.x86_64. But in RHEL6.5Beta, the same 
version 0.12.1, if use "qemu-img convert -s", it reports a not support '-s' 
option error message. The rpm version is qemu-img-0.12.1.2-2.402.el6.x86_64. 
  And because of this issue, the openstack "nova image-create" will raise an 
error. Because it uses this command with "-s" option.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1248427/+subscriptions



Re: [Qemu-devel] [PATCH] e1000/rtl8139: update HMP NIC when every bit is written

2013-11-07 Thread Vlad Yasevich

On 11/07/2013 09:33 AM, Alex Williamson wrote:

On Thu, 2013-11-07 at 12:26 +0200, Michael S. Tsirkin wrote:

On Thu, Nov 07, 2013 at 03:32:29PM +0800, Amos Kong wrote:

On Thu, Nov 07, 2013 at 08:59:22AM +0200, Michael S. Tsirkin wrote:

On Tue, Nov 05, 2013 at 07:17:18PM +0800, Amos Kong wrote:

We currently just update the HMP NIC info when the last bit of macaddr
is written. This assumes that guest driver will write all the macaddr
from bit 0 to bit 5 when it changes the macaddr, this is the current
behavior of linux driver (e1000/rtl8139cp), but we can't do this
assumption.

The macaddr that is used for rx-filter will be updated when every bit
is changed. This patch updates the e1000/rtl8139 nic to update HMP NIC
info when every bit is changed. It will be same as virtio-net.

Signed-off-by: Amos Kong 


I'm not sure I buy this.

If we actually implement e.g. mac change notifications,
sending them on writes of random bytes will confuse
the host.


This patch just effects the monitor display of macaddr.
During each writing, the macaddr is used for rx-filter is really
changed.

In the real hardware, it supports to just write part of bits,
the rx-filtering is effected by every bit writing.


Yes but again, the window can just be too small to matter
on real hardware.

Our emulation is not perfect, fixing this to be just like real
hardware just might expose other bugs we can't fix
that easily.


If we were to implement mac change notification, then every partial
update would send a notify and the host.  Is that a problem?  It seems
no different than if the device had an atomic mac update procedure and
the guest admin changed the mac several times.


Yes, but the issue is exercebated in the non-atomic case.  RTL8139
is the worst since it has byte oriented writes so that for ever mac
change, we have the worst case potential to generate 6 notificates
(assuming libvirt is so fast as to pick up ever change).

Additionally, once libvirt is updated, this would cause rather serious
churn on the host as for ever update, libvirt is going to push the
address down to the physical host nic and the fewer of these updates
we do the better.



The problem with assuming that a given byte is always written last is
that unless the hardware spec identifies an order, we're just basing our
code on examples where we know what the guest driver does, either by
code inspection or access tracing.  If there are examples of guests that
write the last byte first, then the host will never know the correct mac
address.  Maybe there are no guests that use the wrong order, but that's
a pretty exhaustive search.

The patch doesn't change anything about how the NIC operates, only when
mac changes are updated.  During an update the mac is in a transitory
state and we can't go back in time to make it atomic on this hardware
design to avoid a window where the wrong mac may be seen.  I think the
patch is the right thing to do.  Thanks,



Reporting half-complete state is not the right thing, IMO.  Right now,
it's doesn't have much impact, but if we start writing these addresses
to the host nic, then this will have a much bigger impact as I said above.

-vlad


Alex


I would say let's leave e1000/rtl8139 well alone unless
we see guests that actually write mac without touching
the last byte.


At least, linux rtl8139cp/e1000 writes macaddr from bit 0 to bit 5.
It works to just watch the last bit.

Thanks, Amos


Then think of ways to detect when mac change is done
for these.


---
  hw/net/e1000.c   | 2 +-
  hw/net/rtl8139.c | 5 +
  2 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/hw/net/e1000.c b/hw/net/e1000.c
index ec8ecd7..2d60639 100644
--- a/hw/net/e1000.c
+++ b/hw/net/e1000.c
@@ -1110,7 +1110,7 @@ mac_writereg(E1000State *s, int index, uint32_t val)

  s->mac_reg[index] = val;

-if (index == RA + 1) {
+if (index == RA || index == RA + 1) {
  macaddr[0] = cpu_to_le32(s->mac_reg[RA]);
  macaddr[1] = cpu_to_le32(s->mac_reg[RA + 1]);
  qemu_format_nic_info_str(qemu_get_queue(s->nic), (uint8_t *)macaddr);
diff --git a/hw/net/rtl8139.c b/hw/net/rtl8139.c
index 5329f44..7f2b4db 100644
--- a/hw/net/rtl8139.c
+++ b/hw/net/rtl8139.c
@@ -2741,10 +2741,7 @@ static void rtl8139_io_writeb(void *opaque, uint8_t 
addr, uint32_t val)

  switch (addr)
  {
-case MAC0 ... MAC0+4:
-s->phys[addr - MAC0] = val;
-break;
-case MAC0+5:
+case MAC0 ... MAC0+5:
  s->phys[addr - MAC0] = val;
  qemu_format_nic_info_str(qemu_get_queue(s->nic), s->phys);
  break;
--
1.8.3.1












Re: [Qemu-devel] [PATCH] e1000/rtl8139: update HMP NIC when every bit is written

2013-11-07 Thread Vlad Yasevich

On 11/07/2013 10:27 AM, Michael S. Tsirkin wrote:

On Thu, Nov 07, 2013 at 07:33:57AM -0700, Alex Williamson wrote:

On Thu, 2013-11-07 at 12:26 +0200, Michael S. Tsirkin wrote:

On Thu, Nov 07, 2013 at 03:32:29PM +0800, Amos Kong wrote:

On Thu, Nov 07, 2013 at 08:59:22AM +0200, Michael S. Tsirkin wrote:

On Tue, Nov 05, 2013 at 07:17:18PM +0800, Amos Kong wrote:

We currently just update the HMP NIC info when the last bit of macaddr
is written. This assumes that guest driver will write all the macaddr
from bit 0 to bit 5 when it changes the macaddr, this is the current
behavior of linux driver (e1000/rtl8139cp), but we can't do this
assumption.

The macaddr that is used for rx-filter will be updated when every bit
is changed. This patch updates the e1000/rtl8139 nic to update HMP NIC
info when every bit is changed. It will be same as virtio-net.

Signed-off-by: Amos Kong 


I'm not sure I buy this.

If we actually implement e.g. mac change notifications,
sending them on writes of random bytes will confuse
the host.


This patch just effects the monitor display of macaddr.
During each writing, the macaddr is used for rx-filter is really
changed.

In the real hardware, it supports to just write part of bits,
the rx-filtering is effected by every bit writing.


Yes but again, the window can just be too small to matter
on real hardware.

Our emulation is not perfect, fixing this to be just like real
hardware just might expose other bugs we can't fix
that easily.


If we were to implement mac change notification, then every partial
update would send a notify and the host.  Is that a problem?  It seems
no different than if the device had an atomic mac update procedure and
the guest admin changed the mac several times.


I think modern nics make address updates atomic.
Problem is, we are emulating an ancient one,
so to make host configuration of a modern one
reasonable we need to resort to tricks.


The problem with assuming that a given byte is always written last is
that unless the hardware spec identifies an order, we're just basing our
code on examples where we know what the guest driver does, either by
code inspection or access tracing.  If there are examples of guests that
write the last byte first, then the host will never know the correct mac
address.  Maybe there are no guests that use the wrong order, but that's
a pretty exhaustive search.


I agree what we have is a hack. Maybe we need some better hacks.
For example, maybe we should update mac at link state change
(I think  link is always down when mac is updated?).
Needs some thought.


I thought this too, but checking recent linux kernel, e1000 and rtl8139 
seem to allow live mac change so link is up.


So here is a stupid, untested patch for e1000 to notify only once:
diff --git a/hw/net/e1000.c b/hw/net/e1000.c
index 8387443..b99eba4 100644
--- a/hw/net/e1000.c
+++ b/hw/net/e1000.c
@@ -149,6 +149,10 @@ typedef struct E1000State_st {
 #define E1000_FLAG_AUTONEG (1 << E1000_FLAG_AUTONEG_BIT)
 #define E1000_FLAG_MIT (1 << E1000_FLAG_MIT_BIT)
 uint32_t compat_flags;
+uint32_t mac_change_flags;
+#define E1000_RA0_CHANGED 0
+#define E1000_RA1_CHANGED 1
+#define E1000_RA_ALL_CHANGED (E1000_RA0_CHANGED|E1000_RA1_CHANGED)
 } E1000State;

 #define TYPE_E1000 "e1000"
@@ -402,6 +406,7 @@ static void e1000_reset(void *opaque)
 d->mac_reg[RA + 1] |= (i < 2) ? macaddr[i + 4] << (8 * i) : 0;
 }
 qemu_format_nic_info_str(qemu_get_queue(d->nic), macaddr);
+d->mac_change_flags = 0;
 }

 static void
@@ -1106,10 +,20 @@ mac_writereg(E1000State *s, int index, uint32_t val)

 s->mac_reg[index] = val;

-if (index == RA + 1) {
+switch (index) {
+   case RA:
+   s->mac_change_flags |= E1000_RA0_CHANGED;
+   break;
+   case (RA + 1):
+   s->mac_change_flags |= E1000_RA1_CHANGED;
+   break;
+}
+
+if (!(s->mac_change_flags ^ E1000_RA_ALL_CHANGED)) {
 macaddr[0] = cpu_to_le32(s->mac_reg[RA]);
 macaddr[1] = cpu_to_le32(s->mac_reg[RA + 1]);
 qemu_format_nic_info_str(qemu_get_queue(s->nic), (uint8_t 
*)macaddr);

+s->mac_change_flags = 0;
 }
 }

Any thoughts?

-vlad



Re: [Qemu-devel] [PATCH] exec: fix regression by making system-memory region UINT64_MAX size

2013-11-07 Thread Paolo Bonzini
Il 07/11/2013 22:24, Marcel Apfelbaum ha scritto:
> Thank you Laszlo for the detailed info!
> I think the problem is right above. Why pci-hole and system.flash collide?
> IMHO we should not play with priorities here, better solve the collision.

We need to audit all the other boards that support PCI...  I'll take a
look tomorrow since you guys are off.

Paolo



Re: [Qemu-devel] [edk2] [PATCH] exec: fix regression by making system-memory region UINT64_MAX size

2013-11-07 Thread Laszlo Ersek
On 11/07/13 22:21, Paolo Bonzini wrote:
> Il 07/11/2013 22:12, Laszlo Ersek ha scritto:
>> -7ffe (prio 0, RW): system
>>   [...]
>>   6000- (prio 0, RW): alias pci-hole @pci 
>> 6000-
>>   [...]
>>   ffe0- (prio 0, R-): system.flash
>> [...]
> 
> Priorities are not "transitive" across aliases; once you use an alias to
> map a region, the alias's priority counts, not the target region's
> priority.  So the INT_MIN priority for pci-master-abort counts *within
> the alias*, but the choice between pci-hole and system.flash is only
> affected by the priorities of pci-hole and system.flash.

Right. It's also documented in docs/memory.txt -- Peter's recent
addition I think?

> You could give a smaller priority (-1 or INT_MIN) to pci-hole and just
> let it occupy the whole address space, from 0 to UINT64_MAX.  Or perhaps
> the pci-hole alias is too large and it should end before the system
> flash area.  Both solutions should work.

I did reorder pci-hole and system.flash, but rather than lowering
pci-hole, I raised system.flash. I have no preference.

Thanks
Laszlo



Re: [Qemu-devel] [PATCH] exec: fix regression by making system-memory region UINT64_MAX size

2013-11-07 Thread Marcel Apfelbaum
On Thu, 2013-11-07 at 22:12 +0100, Laszlo Ersek wrote:
> This is a QEMU bug report, only disguised as an edk2-devel followup.
> 
> The problem in a nutshell is that the OVMF binary, placed into pflash
> (read-only KVM memslot) used to run in qemu-1.6, but it fails to start
> in qemu-1.7. The OVMF reset vector reads as 0xFF bytes.
> 
> (Jordan and myself started writing these emails in parallel.)
> 
> On 11/07/13 21:27, Jordan Justen wrote:
> > On Sun, Nov 3, 2013 at 12:48 PM, Marcel Apfelbaum
> >  wrote:
> >> The commit:
> >>
> >> Commit: a53ae8e934cd54686875b5bcfc2f434244ee55d6
> >> Author: Marcel Apfelbaum 
> >> Date:   Mon Sep 16 11:21:16 2013 +0300
> >>
> >> hw/pci: partially handle pci master abort
> >>
> >> introduced a regression on make check:
> >
> > Laszlo pointed out that my OVMF flash support series was not working
> > with QEMU master. It was working with QEMU 1.6.0. I then bisected the
> > issue to this commit. It seems this commit regresses -pflash support
> > for both KVM and non-KVM modes.
> >
> > Can you reproduce the issue this with command?
> > x86_64-softmmu/qemu-system-x86_64 -pflash pc-bios/bios.bin
> > (with or without adding -enable-kvm)
> >
> > I tried adding this patch ("exec: fix regression by making
> > system-memory region UINT64_MAX size") and it did not help the -pflash
> > regression.
> 
> 
> From the edk2-devel discussion:
> 
> 
> On 11/07/13 19:07, Laszlo Ersek wrote:
> > On 11/07/13 17:28, Laszlo Ersek wrote:
> >> On 11/06/13 23:29, Jordan Justen wrote:
> >>> https://github.com/jljusten/edk2.git ovmf-nvvars-v2
> >>>
> >>> This series implements support for QEMU's emulated
> >>> system flash.
> >>>
> >>> This allows for persistent UEFI non-volatile variables.
> >>>
> >>> Previously we attempted to emulate non-volatile
> >>> variables in a few ways, but each of them would fail
> >>> in particular situations.
> >>>
> >>> To support flash based variable storage, we:
> >>>  * Reserve space in the firmware device image
> >>>  * Add a new flash firmware volume block driver
> >>>  * Disable EmuVariableFvbRuntimeDxe (at runtime) if QEMU
> >>>flash is available.
> >>>
> >>> To use:
> >>>  * QEMU version 1.1 or newer is required without KVM
> >>>  * KVM support requires Linux 3.7 and QEMU 1.6
> >>>  * Run QEMU with -pflash OVMF.fd instead of -L or -bios
> >>>or use OvmfPkg/build.sh --enable-flash qemu ...
> >>>  * If QEMU is 1.6 or newer, then OvmfPkg/build.sh will
> >>>automatically enable flash when running QEMU, so in
> >>>that case --enable-flash is not required.
> >>>
> >>> See also:
> >>>  * http://wiki.qemu.org/Features/PC_System_Flash
> >>>
> >>> v2:
> >>>  * Replace
> >>>  "OvmfPkg/AcpiPlatformDxe/Qemu: Allow high runtime memory regions"
> >>>with
> >>>  "OvmfPkg/AcpiPlatformDxe/Qemu: Decrease upper limit for PCI window 
> >>> 32"
> >>>  * Separate portions of
> >>>  "OvmfPkg/build.sh: Support --enable-flash switch"
> >>>out into a new patch
> >>>  "OvmfPkg/build.sh: Enable flash for QEMU >= 1.6"
> >>>  * Add "OvmfPkg/README: Add information about OVMF flash layout"
> >>>  * Update "OvmfPkg: Add NV Variable storage within FD" to also change the
> >>>size of PcdVariableStoreSize
> >>>  * Update commit messages on a couple patches for better clarity
> >>
> >> Tested in the following configurations:
> >>
> >> (1) RHEL-6 host (no KVM support, no qemu support -- that is,
> >> regression test): RHEL-6, Fedora 19, Windows 2008 R2 (needs CSM),
> >> Windows 2012 R2 boot tests work OK.
> >>
> >> (2) 3.10-based host kernel, qemu v1.7.0-rc0, Xeon W3550 host CPU:
> >> Unfortunately qemu dies with the following KVM trace:
> >>
> >>   KVM internal error. Suberror: 1
> >>   emulation failure
> >>   EAX= EBX= ECX= EDX=0623
> >>   ESI= EDI= EBP= ESP=
> >>   EIP=fff0 EFL=0002 [---] CPL=0 II=0 A20=1 SMM=0 HLT=0
> >>   ES =   9300
> >>   CS =f000   9b00
> >>   SS =   9300
> >>   DS =   9300
> >>   FS =   9300
> >>   GS =   9300
> >>   LDT=   8200
> >>   TR =   8b00
> >>   GDT=  
> >>   IDT=  
> >>   CR0=6010 CR2= CR3= CR4=
> >>   DR0= DR1= DR2= 
> >> DR3=
> >>   DR6=0ff0 DR7=0400
> >>   EFER=
> >>   Code=ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ff 
> >> ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 
> >> ff ff ff
> >>
> >> I'm quite unsure, but the CS:IP value seems to point at the reset
> >> vector, no? However, the Code=... log only shows 0xFF bytes.
> >>
> >> (3) 3.10.17 host kernel, qemu v1.7.0-rc0,

Re: [Qemu-devel] [edk2] [PATCH] exec: fix regression by making system-memory region UINT64_MAX size

2013-11-07 Thread Paolo Bonzini
Il 07/11/2013 22:12, Laszlo Ersek ha scritto:
> -7ffe (prio 0, RW): system
>   [...]
>   6000- (prio 0, RW): alias pci-hole @pci 
> 6000-
>   [...]
>   ffe0- (prio 0, R-): system.flash
> [...]

Priorities are not "transitive" across aliases; once you use an alias to
map a region, the alias's priority counts, not the target region's
priority.  So the INT_MIN priority for pci-master-abort counts *within
the alias*, but the choice between pci-hole and system.flash is only
affected by the priorities of pci-hole and system.flash.

You could give a smaller priority (-1 or INT_MIN) to pci-hole and just
let it occupy the whole address space, from 0 to UINT64_MAX.  Or perhaps
the pci-hole alias is too large and it should end before the system
flash area.  Both solutions should work.

Paolo



Re: [Qemu-devel] [PATCH 1/2] block: Add bs->node_name to hold the name of a bs node of the bs graph.

2013-11-07 Thread Benoît Canet
Le Thursday 07 Nov 2013 à 15:54:09 (-0500), Jeff Cody a écrit :
> On Thu, Nov 07, 2013 at 04:01:42PM +0100, Benoît Canet wrote:
> > Add the minimum of code to prepare the followings patches.
> > 
> > If no node_name is provided to bdrv_new the bs->node_name is set to 
> > "undefined".
> > This will allow to have some default string to communicate in QMP and HMP.
> > This also make "undefined" a reserved string for bs->node_name.
> 
> Hi Benoît,
> 
> Is it necessary to have a reserved string, or would an empty
> null-terminated string be able to implicitly denote the name as
> undefined?

This would works too and just report the "undefined" logic to hmp printing code.
I'll do this.

> 
> > 
> > Signed-off-by: Benoit Canet 
> > ---
> >  block.c   | 70 
> > +++
> >  block/blkverify.c |  2 +-
> >  block/iscsi.c |  2 +-
> >  block/vmdk.c  |  2 +-
> >  block/vvfat.c |  4 +--
> >  blockdev.c|  8 +++---
> >  hw/block/xen_disk.c   |  2 +-
> >  include/block/block.h |  3 +-
> >  include/block/block_int.h |  9 +-
> >  qemu-img.c|  6 ++--
> >  qemu-io.c |  2 +-
> >  qemu-nbd.c|  2 +-
> >  12 files changed, 77 insertions(+), 35 deletions(-)
> > 
> > diff --git a/block.c b/block.c
> > index fd05a80..230e71a 100644
> > --- a/block.c
> > +++ b/block.c
> > @@ -89,6 +89,9 @@ static int coroutine_fn 
> > bdrv_co_do_write_zeroes(BlockDriverState *bs,
> >  static QTAILQ_HEAD(, BlockDriverState) bdrv_states =
> >  QTAILQ_HEAD_INITIALIZER(bdrv_states);
> >  
> > +static QTAILQ_HEAD(, BlockDriverState) graph_bdrv_states =
> > +QTAILQ_HEAD_INITIALIZER(graph_bdrv_states);
> > +
> >  static QLIST_HEAD(, BlockDriver) bdrv_drivers =
> >  QLIST_HEAD_INITIALIZER(bdrv_drivers);
> >  
> > @@ -318,14 +321,26 @@ void bdrv_register(BlockDriver *bdrv)
> >  }
> >  
> >  /* create a new block device (by default it is empty) */
> > -BlockDriverState *bdrv_new(const char *device_name)
> > +BlockDriverState *bdrv_new(const char *device_name, const char *node_name)
> >  {
> >  BlockDriverState *bs;
> >  
> >  bs = g_malloc0(sizeof(BlockDriverState));
> >  pstrcpy(bs->device_name, sizeof(bs->device_name), device_name);
> >  if (device_name[0] != '\0') {
> > -QTAILQ_INSERT_TAIL(&bdrv_states, bs, list);
> > +QTAILQ_INSERT_TAIL(&bdrv_states, bs, device_list);
> > +}
> > +/* if node name is given store it in bs and insert bs in the graph bs 
> > list
> > + * note: undefined is a reserved node name
> > + */
> > +if (node_name &&
> > +node_name[0] != '\0' &&
> > +strcmp(node_name, "undefined")) {
> > +pstrcpy(bs->node_name, sizeof(bs->node_name), node_name);
> > +QTAILQ_INSERT_TAIL(&graph_bdrv_states, bs, node_list);
> > +/* else set the bs node name to undefined for QMP and HMP */
> > +} else {
> > +sprintf(bs->node_name, "undefined");
> >  }
> >  bdrv_iostatus_disable(bs);
> >  notifier_list_init(&bs->close_notifiers);
> > @@ -870,7 +885,7 @@ int bdrv_file_open(BlockDriverState **pbs, const char 
> > *filename,
> >  options = qdict_new();
> >  }
> >  
> > -bs = bdrv_new("");
> > +bs = bdrv_new("", NULL);
> >  bs->options = options;
> >  options = qdict_clone_shallow(options);
> >  
> > @@ -992,7 +1007,7 @@ int bdrv_open_backing_file(BlockDriverState *bs, QDict 
> > *options, Error **errp)
> > sizeof(backing_filename));
> >  }
> >  
> > -bs->backing_hd = bdrv_new("");
> > +bs->backing_hd = bdrv_new("", NULL);
> >  
> >  if (bs->backing_format[0] != '\0') {
> >  back_drv = bdrv_find_format(bs->backing_format);
> > @@ -1062,7 +1077,7 @@ int bdrv_open(BlockDriverState *bs, const char 
> > *filename, QDict *options,
> > instead of opening 'filename' directly */
> >  
> >  /* if there is a backing file, use it */
> > -bs1 = bdrv_new("");
> > +bs1 = bdrv_new("", NULL);
> >  ret = bdrv_open(bs1, filename, NULL, 0, drv, &local_err);
> >  if (ret < 0) {
> >  bdrv_unref(bs1);
> > @@ -1495,7 +1510,7 @@ void bdrv_close_all(void)
> >  {
> >  BlockDriverState *bs;
> >  
> > -QTAILQ_FOREACH(bs, &bdrv_states, list) {
> > +QTAILQ_FOREACH(bs, &bdrv_states, device_list) {
> >  bdrv_close(bs);
> >  }
> >  }
> > @@ -1524,7 +1539,7 @@ static bool bdrv_requests_pending(BlockDriverState 
> > *bs)
> >  static bool bdrv_requests_pending_all(void)
> >  {
> >  BlockDriverState *bs;
> > -QTAILQ_FOREACH(bs, &bdrv_states, list) {
> > +QTAILQ_FOREACH(bs, &bdrv_states, device_list) {
> >  if (bdrv_requests_pending(bs)) {
> >  return true;
> >  }
> > @@ -1554,7 +1569,7 @@ void bdrv_drain_all(void)
> >  /* FIXME: We do not have timer support here, so this is effectiv

Re: [Qemu-devel] [PATCH] exec: fix regression by making system-memory region UINT64_MAX size

2013-11-07 Thread Laszlo Ersek
This is a QEMU bug report, only disguised as an edk2-devel followup.

The problem in a nutshell is that the OVMF binary, placed into pflash
(read-only KVM memslot) used to run in qemu-1.6, but it fails to start
in qemu-1.7. The OVMF reset vector reads as 0xFF bytes.

(Jordan and myself started writing these emails in parallel.)

On 11/07/13 21:27, Jordan Justen wrote:
> On Sun, Nov 3, 2013 at 12:48 PM, Marcel Apfelbaum
>  wrote:
>> The commit:
>>
>> Commit: a53ae8e934cd54686875b5bcfc2f434244ee55d6
>> Author: Marcel Apfelbaum 
>> Date:   Mon Sep 16 11:21:16 2013 +0300
>>
>> hw/pci: partially handle pci master abort
>>
>> introduced a regression on make check:
>
> Laszlo pointed out that my OVMF flash support series was not working
> with QEMU master. It was working with QEMU 1.6.0. I then bisected the
> issue to this commit. It seems this commit regresses -pflash support
> for both KVM and non-KVM modes.
>
> Can you reproduce the issue this with command?
> x86_64-softmmu/qemu-system-x86_64 -pflash pc-bios/bios.bin
> (with or without adding -enable-kvm)
>
> I tried adding this patch ("exec: fix regression by making
> system-memory region UINT64_MAX size") and it did not help the -pflash
> regression.


>From the edk2-devel discussion:


On 11/07/13 19:07, Laszlo Ersek wrote:
> On 11/07/13 17:28, Laszlo Ersek wrote:
>> On 11/06/13 23:29, Jordan Justen wrote:
>>> https://github.com/jljusten/edk2.git ovmf-nvvars-v2
>>>
>>> This series implements support for QEMU's emulated
>>> system flash.
>>>
>>> This allows for persistent UEFI non-volatile variables.
>>>
>>> Previously we attempted to emulate non-volatile
>>> variables in a few ways, but each of them would fail
>>> in particular situations.
>>>
>>> To support flash based variable storage, we:
>>>  * Reserve space in the firmware device image
>>>  * Add a new flash firmware volume block driver
>>>  * Disable EmuVariableFvbRuntimeDxe (at runtime) if QEMU
>>>flash is available.
>>>
>>> To use:
>>>  * QEMU version 1.1 or newer is required without KVM
>>>  * KVM support requires Linux 3.7 and QEMU 1.6
>>>  * Run QEMU with -pflash OVMF.fd instead of -L or -bios
>>>or use OvmfPkg/build.sh --enable-flash qemu ...
>>>  * If QEMU is 1.6 or newer, then OvmfPkg/build.sh will
>>>automatically enable flash when running QEMU, so in
>>>that case --enable-flash is not required.
>>>
>>> See also:
>>>  * http://wiki.qemu.org/Features/PC_System_Flash
>>>
>>> v2:
>>>  * Replace
>>>  "OvmfPkg/AcpiPlatformDxe/Qemu: Allow high runtime memory regions"
>>>with
>>>  "OvmfPkg/AcpiPlatformDxe/Qemu: Decrease upper limit for PCI window 32"
>>>  * Separate portions of
>>>  "OvmfPkg/build.sh: Support --enable-flash switch"
>>>out into a new patch
>>>  "OvmfPkg/build.sh: Enable flash for QEMU >= 1.6"
>>>  * Add "OvmfPkg/README: Add information about OVMF flash layout"
>>>  * Update "OvmfPkg: Add NV Variable storage within FD" to also change the
>>>size of PcdVariableStoreSize
>>>  * Update commit messages on a couple patches for better clarity
>>
>> Tested in the following configurations:
>>
>> (1) RHEL-6 host (no KVM support, no qemu support -- that is,
>> regression test): RHEL-6, Fedora 19, Windows 2008 R2 (needs CSM),
>> Windows 2012 R2 boot tests work OK.
>>
>> (2) 3.10-based host kernel, qemu v1.7.0-rc0, Xeon W3550 host CPU:
>> Unfortunately qemu dies with the following KVM trace:
>>
>>   KVM internal error. Suberror: 1
>>   emulation failure
>>   EAX= EBX= ECX= EDX=0623
>>   ESI= EDI= EBP= ESP=
>>   EIP=fff0 EFL=0002 [---] CPL=0 II=0 A20=1 SMM=0 HLT=0
>>   ES =   9300
>>   CS =f000   9b00
>>   SS =   9300
>>   DS =   9300
>>   FS =   9300
>>   GS =   9300
>>   LDT=   8200
>>   TR =   8b00
>>   GDT=  
>>   IDT=  
>>   CR0=6010 CR2= CR3= CR4=
>>   DR0= DR1= DR2= 
>> DR3=
>>   DR6=0ff0 DR7=0400
>>   EFER=
>>   Code=ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ff 
>> ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 
>> ff ff ff
>>
>> I'm quite unsure, but the CS:IP value seems to point at the reset
>> vector, no? However, the Code=... log only shows 0xFF bytes.
>>
>> (3) 3.10.17 host kernel, qemu v1.7.0-rc0, Athlon II X2 B22 host CPU:
>> almost identical KVM error, with the following difference:
>>
>>   --- vmx 2013-11-07 17:23:45.612973935 +0100
>>   +++ svm 2013-11-07 17:24:17.237973059 +0100
>>   @@ -1,6 +1,6 @@
>>KVM internal error. Suberror: 1
>>emulation failure
>>   -EAX= EBX=00

Re: [Qemu-devel] [PATCH 1/2] block: Add bs->node_name to hold the name of a bs node of the bs graph.

2013-11-07 Thread Benoît Canet
Le Thursday 07 Nov 2013 à 13:23:43 (-0700), Eric Blake a écrit :
> On 11/07/2013 08:01 AM, Benoît Canet wrote:
> > Add the minimum of code to prepare the followings patches.
> > 
> > If no node_name is provided to bdrv_new the bs->node_name is set to 
> > "undefined".
> > This will allow to have some default string to communicate in QMP and HMP.
> > This also make "undefined" a reserved string for bs->node_name.
> > 
> > Signed-off-by: Benoit Canet 
> > ---
> 
> > +/* if node name is given store it in bs and insert bs in the graph bs 
> > list
> > + * note: undefined is a reserved node name
> > + */
> > +if (node_name &&
> > +node_name[0] != '\0' &&
> > +strcmp(node_name, "undefined")) {
> > +pstrcpy(bs->node_name, sizeof(bs->node_name), node_name);
> > +QTAILQ_INSERT_TAIL(&graph_bdrv_states, bs, node_list);
> > +/* else set the bs node name to undefined for QMP and HMP */
> > +} else {
> > +sprintf(bs->node_name, "undefined");
> 
> strcpy() is more efficient than sprintf() with no % in the format string.
> 
> 
> >  
> > +/* This function is to find a node in the bs graph */
> > +BlockDriverState *bdrv_find_node(const char *node_name)
> > +{
> > +BlockDriverState *bs;
> > +
> 
> Should this function assert that node_name is not "undefined"?
> 
> 
> > +++ b/include/block/block_int.h
> > @@ -297,11 +297,18 @@ struct BlockDriverState {
> >  BlockdevOnError on_read_error, on_write_error;
> >  bool iostatus_enabled;
> >  BlockDeviceIoStatus iostatus;
> > +
> > +/* the following member give a name to every node on the 
> > BlockDriverState
> 
> s/give/gives/
> 
> > + * graph.
> > + */
> > +char node_name[32];
> > +QTAILQ_ENTRY(BlockDriverState) node_list;
> > +/* Device name is the name associated with the "drive" the guest see */
> 
> s/see/sees/
> 
> >  char device_name[32];
> > +QTAILQ_ENTRY(BlockDriverState) device_list;
> 
> Maybe I missed it, but is there code that ensures there are no duplicate
> node names (other than the magic "undefined")?
> 
Your are right I will add some checkings.

> Seems like it's moving in the right direction, although I'm not sure
> it's worth applying this until we know the qapi for working with node
> names makes sense.

I am not seeking for fast commit of these patch, I am only trying to avoid
posting a long series at once.

Best regards

Benoît

> 
> -- 
> Eric Blake   eblake redhat com+1-919-301-3266
> Libvirt virtualization library http://libvirt.org
> 





Re: [Qemu-devel] [PATCH 1/2] block: Add bs->node_name to hold the name of a bs node of the bs graph.

2013-11-07 Thread Jeff Cody
On Thu, Nov 07, 2013 at 04:01:42PM +0100, Benoît Canet wrote:
> Add the minimum of code to prepare the followings patches.
> 
> If no node_name is provided to bdrv_new the bs->node_name is set to 
> "undefined".
> This will allow to have some default string to communicate in QMP and HMP.
> This also make "undefined" a reserved string for bs->node_name.

Hi Benoît,

Is it necessary to have a reserved string, or would an empty
null-terminated string be able to implicitly denote the name as
undefined?

> 
> Signed-off-by: Benoit Canet 
> ---
>  block.c   | 70 
> +++
>  block/blkverify.c |  2 +-
>  block/iscsi.c |  2 +-
>  block/vmdk.c  |  2 +-
>  block/vvfat.c |  4 +--
>  blockdev.c|  8 +++---
>  hw/block/xen_disk.c   |  2 +-
>  include/block/block.h |  3 +-
>  include/block/block_int.h |  9 +-
>  qemu-img.c|  6 ++--
>  qemu-io.c |  2 +-
>  qemu-nbd.c|  2 +-
>  12 files changed, 77 insertions(+), 35 deletions(-)
> 
> diff --git a/block.c b/block.c
> index fd05a80..230e71a 100644
> --- a/block.c
> +++ b/block.c
> @@ -89,6 +89,9 @@ static int coroutine_fn 
> bdrv_co_do_write_zeroes(BlockDriverState *bs,
>  static QTAILQ_HEAD(, BlockDriverState) bdrv_states =
>  QTAILQ_HEAD_INITIALIZER(bdrv_states);
>  
> +static QTAILQ_HEAD(, BlockDriverState) graph_bdrv_states =
> +QTAILQ_HEAD_INITIALIZER(graph_bdrv_states);
> +
>  static QLIST_HEAD(, BlockDriver) bdrv_drivers =
>  QLIST_HEAD_INITIALIZER(bdrv_drivers);
>  
> @@ -318,14 +321,26 @@ void bdrv_register(BlockDriver *bdrv)
>  }
>  
>  /* create a new block device (by default it is empty) */
> -BlockDriverState *bdrv_new(const char *device_name)
> +BlockDriverState *bdrv_new(const char *device_name, const char *node_name)
>  {
>  BlockDriverState *bs;
>  
>  bs = g_malloc0(sizeof(BlockDriverState));
>  pstrcpy(bs->device_name, sizeof(bs->device_name), device_name);
>  if (device_name[0] != '\0') {
> -QTAILQ_INSERT_TAIL(&bdrv_states, bs, list);
> +QTAILQ_INSERT_TAIL(&bdrv_states, bs, device_list);
> +}
> +/* if node name is given store it in bs and insert bs in the graph bs 
> list
> + * note: undefined is a reserved node name
> + */
> +if (node_name &&
> +node_name[0] != '\0' &&
> +strcmp(node_name, "undefined")) {
> +pstrcpy(bs->node_name, sizeof(bs->node_name), node_name);
> +QTAILQ_INSERT_TAIL(&graph_bdrv_states, bs, node_list);
> +/* else set the bs node name to undefined for QMP and HMP */
> +} else {
> +sprintf(bs->node_name, "undefined");
>  }
>  bdrv_iostatus_disable(bs);
>  notifier_list_init(&bs->close_notifiers);
> @@ -870,7 +885,7 @@ int bdrv_file_open(BlockDriverState **pbs, const char 
> *filename,
>  options = qdict_new();
>  }
>  
> -bs = bdrv_new("");
> +bs = bdrv_new("", NULL);
>  bs->options = options;
>  options = qdict_clone_shallow(options);
>  
> @@ -992,7 +1007,7 @@ int bdrv_open_backing_file(BlockDriverState *bs, QDict 
> *options, Error **errp)
> sizeof(backing_filename));
>  }
>  
> -bs->backing_hd = bdrv_new("");
> +bs->backing_hd = bdrv_new("", NULL);
>  
>  if (bs->backing_format[0] != '\0') {
>  back_drv = bdrv_find_format(bs->backing_format);
> @@ -1062,7 +1077,7 @@ int bdrv_open(BlockDriverState *bs, const char 
> *filename, QDict *options,
> instead of opening 'filename' directly */
>  
>  /* if there is a backing file, use it */
> -bs1 = bdrv_new("");
> +bs1 = bdrv_new("", NULL);
>  ret = bdrv_open(bs1, filename, NULL, 0, drv, &local_err);
>  if (ret < 0) {
>  bdrv_unref(bs1);
> @@ -1495,7 +1510,7 @@ void bdrv_close_all(void)
>  {
>  BlockDriverState *bs;
>  
> -QTAILQ_FOREACH(bs, &bdrv_states, list) {
> +QTAILQ_FOREACH(bs, &bdrv_states, device_list) {
>  bdrv_close(bs);
>  }
>  }
> @@ -1524,7 +1539,7 @@ static bool bdrv_requests_pending(BlockDriverState *bs)
>  static bool bdrv_requests_pending_all(void)
>  {
>  BlockDriverState *bs;
> -QTAILQ_FOREACH(bs, &bdrv_states, list) {
> +QTAILQ_FOREACH(bs, &bdrv_states, device_list) {
>  if (bdrv_requests_pending(bs)) {
>  return true;
>  }
> @@ -1554,7 +1569,7 @@ void bdrv_drain_all(void)
>  /* FIXME: We do not have timer support here, so this is effectively
>   * a busy wait.
>   */
> -QTAILQ_FOREACH(bs, &bdrv_states, list) {
> +QTAILQ_FOREACH(bs, &bdrv_states, device_list) {
>  if (bdrv_start_throttled_reqs(bs)) {
>  busy = true;
>  }
> @@ -1570,7 +1585,7 @@ void bdrv_drain_all(void)
>  void bdrv_make_anon(BlockDriverState *bs)
>  {
>  if (bs->device_name[0] != '\0') {
> -QTAILQ_REM

Re: [Qemu-devel] [PATCH] exec: fix regression by making system-memory region UINT64_MAX size

2013-11-07 Thread Marcel Apfelbaum
On Thu, 2013-11-07 at 12:27 -0800, Jordan Justen wrote:
> On Sun, Nov 3, 2013 at 12:48 PM, Marcel Apfelbaum  wrote:
> > The commit:
> >
> > Commit: a53ae8e934cd54686875b5bcfc2f434244ee55d6
> > Author: Marcel Apfelbaum 
> > Date:   Mon Sep 16 11:21:16 2013 +0300
> >
> > hw/pci: partially handle pci master abort
> >
> > introduced a regression on make check:
> 
> Laszlo pointed out that my OVMF flash support series was not working
> with QEMU master. It was working with QEMU 1.6.0. I then bisected the
> issue to this commit. It seems this commit regresses -pflash support
> for both KVM and non-KVM modes.
> 
> Can you reproduce the issue this with command?
> x86_64-softmmu/qemu-system-x86_64 -pflash pc-bios/bios.bin
> (with or without adding -enable-kvm)
Thanks Jordan for pointing this out.
This patch revealed a lot of hidden issues inside qemu ...
I succeeded to reproduce the issue.

I saw that get_page_addr_code associated address 0xfff0
(after calling io_tlb_to_region) with the new master abort region.
It breaks because memory_region_is_unassigned returns true for the new region.
It is interesting what happened before this region was added.

I will investigate this issue,
Thanks,
Marcel

 

> I tried adding this patch ("exec: fix regression by making
> system-memory region UINT64_MAX size") and it did not help the -pflash
> regression.
> 
> -Jordan






Re: [Qemu-devel] [PATCH 2/2] block: Allow the user to define "node-name" option.

2013-11-07 Thread Eric Blake
On 11/07/2013 08:01 AM, Benoît Canet wrote:
> As node-name is a separate name space as device-name we can enable it's

s/space as/space from/
s/it's/its/

> definition right now: nobody will use it so no harm involved.
> 
> Signed-off-by: Benoit Canet 
> ---
>  block.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)

Shouldn't blockdev-add have a QMP counterpart for setting node name
during device hotplug?  Also, is there a QMP command for inspecting node
names yet?  This feels like a write-only interface if we don't have more
usage of it in place.

Don't get me wrong - I think we definitely want this, but in the context
of a bigger series rather than by itself.

> 
> diff --git a/block.c b/block.c
> index 230e71a..132981f 100644
> --- a/block.c
> +++ b/block.c
> @@ -885,7 +885,8 @@ int bdrv_file_open(BlockDriverState **pbs, const char 
> *filename,
>  options = qdict_new();
>  }
>  
> -bs = bdrv_new("", NULL);
> +bs = bdrv_new("", qdict_get_try_str(options, "node-name"));
> +qdict_del(options, "node-name");
>  bs->options = options;
>  options = qdict_clone_shallow(options);
>  
> @@ -1007,7 +1008,8 @@ int bdrv_open_backing_file(BlockDriverState *bs, QDict 
> *options, Error **errp)
> sizeof(backing_filename));
>  }
>  
> -bs->backing_hd = bdrv_new("", NULL);
> +bs->backing_hd = bdrv_new("", qdict_get_try_str(options, "node-name"));
> +qdict_del(options, "node-name");
>  
>  if (bs->backing_format[0] != '\0') {
>  back_drv = bdrv_find_format(bs->backing_format);
> 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH for-1.8 61/61] target-i386: Deconstruct thecpu_T arrayy

2013-11-07 Thread Richard Henderson
On 11/07/2013 08:53 PM, Alex Bennée wrote:
> 
> r...@twiddle.net writes:
> 
>> All references to cpu_T are done with a constant index.  It aids
>> readability to decompose the array into two scalar variables.
> 
> 
> I'm not necessarily disagreeing with the readability argument but does
> this make any difference to the eventual code generated? 
> 

None at all.


r~



Re: [Qemu-devel] [PATCH 0/2] Giving names to graph's BlockDriverState

2013-11-07 Thread Eric Blake
On 11/07/2013 08:01 AM, Benoît Canet wrote:
> The following series introduce a new file.node-name property in order to be
> able to give a name to each BlockDriverState of the graph.
> 
> It also define "undefined" as a special value for node-name; a value that 
> will be
> used to indicate to the management that it can not manipulate a node because 
> it
> was not named.

Is it necessary to use the actual string "undefined" to mean unnamed, or
can we be careful and use NULL for unnamed, and all other pointers as
the user-provided name?  Either way, we have to special-case code to
check for the sentinel, but a NULL check is faster than a strcmp(), plus
it feels slightly nicer to not consume a value out of the user's namespace.

> 
> After this patchset is merged I would like to take care of presenting the 
> graph
> to the management. (HMP &&/|| QMP)
> 
> Eric: Do you have some ideas on this topic ?

Overall, I'm looking forward to getting this into qemu 1.8; I'm still
waiting to see what else you propose for QMP interfaces.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


[Qemu-devel] [PATCH v2] e1000: initial link negotiation on mac osx

2013-11-07 Thread Gabriel L. Somlo
Some guest operating systems' drivers (particularly Mac OS X)
expect the link state to be pre-initialized by an earlier
component such as a proprietary BIOS. This patch injects
additional LSC events upon PHY reset, allowing the OS X driver
to successfully complete initial link negotiation. This is a
follow-up to commit 372254c6e5c078fb13b236bb648d2b9b2b0c70f1,
which works around the OS X driver's failure to properly set
up the MAC address.

Signed-off-by: Gabriel Somlo 
---

On Thu, Nov 07, 2013 at 08:28:47PM +0100, Paolo Bonzini wrote:
> Is there any way to work around this in the guest?  Such as using a
> UEFI driver for e1000 or something like that.

Currently OS X boots on top of SeaBIOS and Chameleon, neither of which
know anything about the e1000 hardware. On real hardware, the XNU e1000
driver expects the proprietary BIOS to set things up "just right", and
doesn't have to bother jumping through all the hoops to properly
initialize the hardware from scratch (after all, the XNU driver
developers only have to care about a limited range of carefully
controlled hardware).

In the VM/guest scenario, QEMU is the only piece that has any knowledge
of the e1000 hardware, so having it prep things for the guest would be
the path of least resistance. Using a completely different alternative
to SeaBIOS (one that would/could assume e1000 is present and would know
enough about it to configure it just right) sounds a lot less feasible.

Oh, and it gets worse :) My v1 patch does enough cheating to get OS X
to link up successfully on a e1000-equipped *Q35* VM. If Q35 is not
specified, then another bit of trickery is needed (i.e. inject another
LSC when the driver initially unmasks LSC in the IMS register...

Anyhow, this second version seems to work both with Q35 and without,
on my vanilla SnowLeopard image.

Any more thoughts and ideas much appreciated!

Thanks,
  Gabriel

 hw/net/e1000.c | 20 +++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/hw/net/e1000.c b/hw/net/e1000.c
index ae63591..fe0f34e 100644
--- a/hw/net/e1000.c
+++ b/hw/net/e1000.c
@@ -186,6 +186,9 @@ e1000_link_up(E1000State *s)
 s->phy_reg[PHY_STATUS] |= MII_SR_LINK_STATUS;
 }
 
+/* Forward decl. for use in set_phy_ctrl() (OS X link nego. workaround) */
+static void set_ics(E1000State *s, int index, uint32_t val);
+
 static void
 set_phy_ctrl(E1000State *s, int index, uint16_t val)
 {
@@ -197,6 +200,15 @@ set_phy_ctrl(E1000State *s, int index, uint16_t val)
 if (!(s->compat_flags & E1000_FLAG_AUTONEG)) {
 return;
 }
+/*
+ * The Mac OS X driver expects a pre-initialized network card; injecting
+ * an extra LSC event here allows initial link negotiation to succeed in
+ * the absence of the Apple EFI BIOS.
+ */
+if ((val & MII_CR_RESET)) {
+set_ics(s, 0, E1000_ICR_LSC);
+return;
+}
 if ((val & MII_CR_AUTO_NEG_EN) && (val & MII_CR_RESTART_AUTO_NEG)) {
 e1000_link_down(s);
 s->phy_reg[PHY_STATUS] &= ~MII_SR_AUTONEG_COMPLETE;
@@ -1159,8 +1171,14 @@ set_imc(E1000State *s, int index, uint32_t val)
 static void
 set_ims(E1000State *s, int index, uint32_t val)
 {
+uint32_t ics_val = 0;
+
+/* When Mac OS X initially unmasks LSC, it expects to see it set in ICS */
+if (s->mac_reg[IMS] == 0 && (val & E1000_IMS_LSC))
+ics_val |= E1000_ICR_LSC;
+
 s->mac_reg[IMS] |= val;
-set_ics(s, 0, 0);
+set_ics(s, 0, ics_val);
 }
 
 #define getreg(x)  [x] = mac_readreg
-- 
1.8.1.4




Re: [Qemu-devel] [PATCH] exec: fix regression by making system-memory region UINT64_MAX size

2013-11-07 Thread Jordan Justen
On Sun, Nov 3, 2013 at 12:48 PM, Marcel Apfelbaum  wrote:
> The commit:
>
> Commit: a53ae8e934cd54686875b5bcfc2f434244ee55d6
> Author: Marcel Apfelbaum 
> Date:   Mon Sep 16 11:21:16 2013 +0300
>
> hw/pci: partially handle pci master abort
>
> introduced a regression on make check:

Laszlo pointed out that my OVMF flash support series was not working
with QEMU master. It was working with QEMU 1.6.0. I then bisected the
issue to this commit. It seems this commit regresses -pflash support
for both KVM and non-KVM modes.

Can you reproduce the issue this with command?
x86_64-softmmu/qemu-system-x86_64 -pflash pc-bios/bios.bin
(with or without adding -enable-kvm)

I tried adding this patch ("exec: fix regression by making
system-memory region UINT64_MAX size") and it did not help the -pflash
regression.

-Jordan



Re: [Qemu-devel] [PATCH 1/2] block: Add bs->node_name to hold the name of a bs node of the bs graph.

2013-11-07 Thread Eric Blake
On 11/07/2013 08:01 AM, Benoît Canet wrote:
> Add the minimum of code to prepare the followings patches.
> 
> If no node_name is provided to bdrv_new the bs->node_name is set to 
> "undefined".
> This will allow to have some default string to communicate in QMP and HMP.
> This also make "undefined" a reserved string for bs->node_name.
> 
> Signed-off-by: Benoit Canet 
> ---

> +/* if node name is given store it in bs and insert bs in the graph bs 
> list
> + * note: undefined is a reserved node name
> + */
> +if (node_name &&
> +node_name[0] != '\0' &&
> +strcmp(node_name, "undefined")) {
> +pstrcpy(bs->node_name, sizeof(bs->node_name), node_name);
> +QTAILQ_INSERT_TAIL(&graph_bdrv_states, bs, node_list);
> +/* else set the bs node name to undefined for QMP and HMP */
> +} else {
> +sprintf(bs->node_name, "undefined");

strcpy() is more efficient than sprintf() with no % in the format string.


>  
> +/* This function is to find a node in the bs graph */
> +BlockDriverState *bdrv_find_node(const char *node_name)
> +{
> +BlockDriverState *bs;
> +

Should this function assert that node_name is not "undefined"?


> +++ b/include/block/block_int.h
> @@ -297,11 +297,18 @@ struct BlockDriverState {
>  BlockdevOnError on_read_error, on_write_error;
>  bool iostatus_enabled;
>  BlockDeviceIoStatus iostatus;
> +
> +/* the following member give a name to every node on the BlockDriverState

s/give/gives/

> + * graph.
> + */
> +char node_name[32];
> +QTAILQ_ENTRY(BlockDriverState) node_list;
> +/* Device name is the name associated with the "drive" the guest see */

s/see/sees/

>  char device_name[32];
> +QTAILQ_ENTRY(BlockDriverState) device_list;

Maybe I missed it, but is there code that ensures there are no duplicate
node names (other than the magic "undefined")?

Seems like it's moving in the right direction, although I'm not sure
it's worth applying this until we know the qapi for working with node
names makes sense.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH] util/error: Save errno from clobbering

2013-11-07 Thread Eric Blake
On 11/07/2013 12:10 PM, Max Reitz wrote:
> There may be calls to error_setg() and especially error_setg_errno()
> which blindly (and until now wrongly) assume these functions not to
> clobber errno (e.g., they pass errno to error_setg_errno() and return
> -errno afterwards). Instead of trying to find and fix all of these
> constructs, just make sure error_setg() and error_setg_errno() indeed do
> not clobber errno.
> 
> Suggested-by: Eric Blake 
> Signed-off-by: Max Reitz 
> ---
>  util/error.c | 6 ++
>  1 file changed, 6 insertions(+)

Reviewed-by: Eric Blake 

I did a quick glance through 'git grep -p -A2 error_set' for any more
culprits beyond the one you just fixed in block.c, and didn't spot any
obvious ones; but there were enough 'goto error' statements where I
didn't check if the code at the error label was depending on sane errno
value; and it is definitely easier for code maintenance if you don't
have to think about whether logging an error will clobber the error.
(Not to mention I've also patched libvirt to have the same paradigm of
logging functions guaranteeing no modification to errno).

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH] qapi-schema: Update description for NewImageMode

2013-11-07 Thread Eric Blake
On 11/07/2013 11:47 AM, Max Reitz wrote:
> If the NewImageMode is "absolute-paths" but no backing file is available
> (e.g., when mirroring a device with an unbacked image), the target image
> will not be backed either. This patch updates the documentation in
> qapi-schema.json accordingly.
> 
> Signed-off-by: Max Reitz 
> ---
> Follow-up to:
>  - block/drive-mirror: Check for NULL backing_hd
> ---
>  qapi-schema.json | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)

Reviewed-by: Eric Blake 

Trivial enough to include in 1.7, if desired.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH v1] e1000: initial link negotiation on mac osx

2013-11-07 Thread jacek burghardt
I know there are patches for seabios


Re: [Qemu-devel] [PATCH v1] e1000: initial link negotiation on mac osx

2013-11-07 Thread Paolo Bonzini
Il 07/11/2013 19:04, Gabriel L. Somlo ha scritto:
> Some guest operating systems' drivers (particularly Mac OS X)
> expect the link state to be pre-initialized by an earlier
> component such as a proprietary BIOS. This patch injects an
> additional LSC event upon PHY reset, allowing the OS X driver
> to successfully complete initial link negotiation. This is a
> follow-up to commit 372254c6e5c078fb13b236bb648d2b9b2b0c70f1,
> which works around the OS X driver's failure to properly set
> up the MAC address.
> 
> Signed-off-by: Gabriel Somlo 
> ---
> 
> I studied the Intel 8254xxx manual and the various earlier
> suggestions, and came up with the following, which both works
> (at least for me, using SnowLeopard) and also appears to make
> sense from a "code flow" point of view. Please let me know
> what you all think.
> 
> Thanks,
>   Gabriel
> 
>  hw/net/e1000.c | 12 
>  1 file changed, 12 insertions(+)
> 
> diff --git a/hw/net/e1000.c b/hw/net/e1000.c
> index ec8ecd7..2f2fc3a 100644
> --- a/hw/net/e1000.c
> +++ b/hw/net/e1000.c
> @@ -186,6 +186,9 @@ e1000_link_up(E1000State *s)
>  s->phy_reg[PHY_STATUS] |= MII_SR_LINK_STATUS;
>  }
>  
> +/* Forward decl. for use in set_phy_ctrl() (OS X link nego. workaround) */
> +static void set_ics(E1000State *s, int index, uint32_t val);
> +
>  static void
>  set_phy_ctrl(E1000State *s, int index, uint16_t val)
>  {
> @@ -197,6 +200,15 @@ set_phy_ctrl(E1000State *s, int index, uint16_t val)
>  if (!(s->compat_flags & E1000_FLAG_AUTONEG)) {
>  return;
>  }
> +/*
> + * The Mac OS X driver expects a pre-initialized network card; injecting
> + * an extra LSC event here allows initial link negotiation to succeed in
> + * the absence of the Apple EFI BIOS.
> + */
> +if ((val & MII_CR_RESET)) {
> +set_ics(s, 0, E1000_ICR_LSC);
> +return;
> +}
>  if ((val & MII_CR_AUTO_NEG_EN) && (val & MII_CR_RESTART_AUTO_NEG)) {
>  e1000_link_down(s);
>  s->phy_reg[PHY_STATUS] &= ~MII_SR_AUTONEG_COMPLETE;
> 

Is there any way to work around this in the guest?  Such as using a UEFI
driver for e1000 or something like that.

Paolo



Re: [Qemu-devel] [PATCH] util/error: Save errno from clobbering

2013-11-07 Thread Benoît Canet
Le Thursday 07 Nov 2013 à 20:10:29 (+0100), Max Reitz a écrit :
> There may be calls to error_setg() and especially error_setg_errno()
> which blindly (and until now wrongly) assume these functions not to
> clobber errno (e.g., they pass errno to error_setg_errno() and return
> -errno afterwards). Instead of trying to find and fix all of these
> constructs, just make sure error_setg() and error_setg_errno() indeed do
> not clobber errno.
> 
> Suggested-by: Eric Blake 
> Signed-off-by: Max Reitz 
> ---
>  util/error.c | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/util/error.c b/util/error.c
> index ec0faa6..3ee362a 100644
> --- a/util/error.c
> +++ b/util/error.c
> @@ -27,6 +27,7 @@ void error_set(Error **errp, ErrorClass err_class, const 
> char *fmt, ...)
>  {
>  Error *err;
>  va_list ap;
> +int saved_errno = errno;
>  
>  if (errp == NULL) {
>  return;
> @@ -41,6 +42,8 @@ void error_set(Error **errp, ErrorClass err_class, const 
> char *fmt, ...)
>  err->err_class = err_class;
>  
>  *errp = err;
> +
> +errno = saved_errno;
>  }
>  
>  void error_set_errno(Error **errp, int os_errno, ErrorClass err_class,
> @@ -49,6 +52,7 @@ void error_set_errno(Error **errp, int os_errno, ErrorClass 
> err_class,
>  Error *err;
>  char *msg1;
>  va_list ap;
> +int saved_errno = errno;
>  
>  if (errp == NULL) {
>  return;
> @@ -69,6 +73,8 @@ void error_set_errno(Error **errp, int os_errno, ErrorClass 
> err_class,
>  err->err_class = err_class;
>  
>  *errp = err;
> +
> +errno = saved_errno;
>  }
>  
>  void error_setg_file_open(Error **errp, int os_errno, const char *filename)
> -- 
> 1.8.4.2
> 
> 

Yes this look better than trying to fix all callers.

Reviewed-by: Benoit Canet 

[Qemu-devel] Windows 8 installation fails with qemu-1.6.1

2013-11-07 Thread Erik Rull

Hi all,

when starting qemu (with or without kvm active) I get the following error 
code after having a few minutes a blue windows logo on a black background:


Your PC needs to restart.
Please hold down the power button.
Error Code: 0x005C
Parameters:
0x0110
0xFFD09BC8
0x0019
0xC001

With KVM I tried -cpu host, without KVM I emulated a core2duo which fits my 
hardware. ACPI is enabled.


Any hints appreciated.

Thanks.

Best regards,

Erik



Re: [Qemu-devel] [PATCH 0/2] exec: alternative fix for master abort woes

2013-11-07 Thread Paolo Bonzini
Il 07/11/2013 19:54, Michael S. Tsirkin ha scritto:
> On Thu, Nov 07, 2013 at 06:29:40PM +0100, Paolo Bonzini wrote:
>> Il 07/11/2013 17:47, Michael S. Tsirkin ha scritto:
>>> That's on kvm with 52 bit address.
>>> But where I would be concerned is systems with e.g. 36 bit address
>>> space where we are doubling the cost of the lookup.
>>> E.g. try i386 and not x86_64.
>>
>> Tried now...
>>
>> P_L2_LEVELS pre-patch   post-patch
>>i386 3   6
>>x86_64   4   6
>>
>> I timed the inl_from_qemu test of vmexit.flat with both KVM and TCG.  With
>> TCG there's indeed a visible penalty of 20 cycles for i386 and 10 for x86_64
>> (you can extrapolate to 30 cycles for TARGET_PHYS_ADDR_SPACE_BITS=32 
>> targets).
>> These can be more or less entirely ascribed to phys_page_find:
>>
>>  TCG |  KVM
>>pre-patch  post-patch |  pre-patch   post-patch
>> phys_page_find(i386)  13% 25%| 0.6% 1%
>> inl_from_qemu cycles(i386)153 173|   ~12000  ~12000
> 
> I'm a bit confused by the numbers above. The % of phys_page_find has
> grown from 13% to  25% (almost double, which is kind of expected
> give we have twice the # of levels).

Yes.

> But overhead in # of cycles only went from 153 to
> 173?

new cycles / old cycles = 173 / 153 = 113%

% outside phys_page_find + % in phys_page_find*2 = 87% + 13%*2 = 113%

> Maybe the test is a bit wrong for tcg - how about unrolling the
> loop in kvm unit test?

Done that already. :)

>> Also, compiling with "-fstack-protector" instead of "-fstack-protector-all",
>> as suggested a while ago by rth, is already giving a savings of 20 cycles.
> 
> Is it true that with TCG this affects more than just MMIO
> as phys_page_find will also sometimes run on CPU accesses to memory?

Yes.  I tried benchmarking with perf the boot of a RHEL guest, which has

 TCG   | KVM
   pre-patch  post-patch   | pre-patchpost-patch
  3% 5.8%  |0.9% 1.7%

This is actually higher than usual for KVM because there are many VGA
access during GRUB.

>> And of course, if this were a realistic test, KVM's 60x penalty would
>> be a severe problem---but it isn't, because this is not a realistic setting.
> 
> Well, for this argument to carry the day we'd need to design
> a realistic test which isn't easy :)

Yes, I guess the number that matters is the extra 2% penalty for TCG
(the part that doesn't come from MMIO).

Paolo



[Qemu-devel] [PATCH] util/error: Save errno from clobbering

2013-11-07 Thread Max Reitz
There may be calls to error_setg() and especially error_setg_errno()
which blindly (and until now wrongly) assume these functions not to
clobber errno (e.g., they pass errno to error_setg_errno() and return
-errno afterwards). Instead of trying to find and fix all of these
constructs, just make sure error_setg() and error_setg_errno() indeed do
not clobber errno.

Suggested-by: Eric Blake 
Signed-off-by: Max Reitz 
---
 util/error.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/util/error.c b/util/error.c
index ec0faa6..3ee362a 100644
--- a/util/error.c
+++ b/util/error.c
@@ -27,6 +27,7 @@ void error_set(Error **errp, ErrorClass err_class, const char 
*fmt, ...)
 {
 Error *err;
 va_list ap;
+int saved_errno = errno;
 
 if (errp == NULL) {
 return;
@@ -41,6 +42,8 @@ void error_set(Error **errp, ErrorClass err_class, const char 
*fmt, ...)
 err->err_class = err_class;
 
 *errp = err;
+
+errno = saved_errno;
 }
 
 void error_set_errno(Error **errp, int os_errno, ErrorClass err_class,
@@ -49,6 +52,7 @@ void error_set_errno(Error **errp, int os_errno, ErrorClass 
err_class,
 Error *err;
 char *msg1;
 va_list ap;
+int saved_errno = errno;
 
 if (errp == NULL) {
 return;
@@ -69,6 +73,8 @@ void error_set_errno(Error **errp, int os_errno, ErrorClass 
err_class,
 err->err_class = err_class;
 
 *errp = err;
+
+errno = saved_errno;
 }
 
 void error_setg_file_open(Error **errp, int os_errno, const char *filename)
-- 
1.8.4.2




Re: [Qemu-devel] [PATCH] qapi-schema: Update description for NewImageMode

2013-11-07 Thread Benoît Canet
Le Thursday 07 Nov 2013 à 19:47:48 (+0100), Max Reitz a écrit :
> If the NewImageMode is "absolute-paths" but no backing file is available
> (e.g., when mirroring a device with an unbacked image), the target image
> will not be backed either. This patch updates the documentation in
> qapi-schema.json accordingly.
> 
> Signed-off-by: Max Reitz 
> ---
> Follow-up to:
>  - block/drive-mirror: Check for NULL backing_hd
> ---
>  qapi-schema.json | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/qapi-schema.json b/qapi-schema.json
> index 81a375b..dde8e45 100644
> --- a/qapi-schema.json
> +++ b/qapi-schema.json
> @@ -1736,7 +1736,8 @@
>  # @existing: QEMU should look for an existing image file.
>  #
>  # @absolute-paths: QEMU should create a new image with absolute paths
> -# for the backing file.
> +# for the backing file. If there is no backing file available, the new
> +# image will not be backed either.
>  #
>  # Since: 1.1
>  ##
> -- 
> 1.8.4.2
> 
> 
Reviewed-by: Benoit Canet 



Re: [Qemu-devel] [PATCH 0/2] exec: alternative fix for master abort woes

2013-11-07 Thread Michael S. Tsirkin
On Thu, Nov 07, 2013 at 06:29:40PM +0100, Paolo Bonzini wrote:
> Il 07/11/2013 17:47, Michael S. Tsirkin ha scritto:
> > That's on kvm with 52 bit address.
> > But where I would be concerned is systems with e.g. 36 bit address
> > space where we are doubling the cost of the lookup.
> > E.g. try i386 and not x86_64.
> 
> Tried now...
> 
> P_L2_LEVELS pre-patch   post-patch
>i386 3   6
>x86_64   4   6
> 
> I timed the inl_from_qemu test of vmexit.flat with both KVM and TCG.  With
> TCG there's indeed a visible penalty of 20 cycles for i386 and 10 for x86_64
> (you can extrapolate to 30 cycles for TARGET_PHYS_ADDR_SPACE_BITS=32 targets).
> These can be more or less entirely ascribed to phys_page_find:
> 
>  TCG |  KVM
>pre-patch  post-patch |  pre-patch   post-patch
> phys_page_find(i386)  13% 25%| 0.6% 1%
> inl_from_qemu cycles(i386)153 173|   ~12000  ~12000

I'm a bit confused by the numbers above. The % of phys_page_find has
grown from 13% to  25% (almost double, which is kind of expected
give we have twice the # of levels). But overhead in # of cycles only went from 
153 to
173? Maybe the test is a bit wrong for tcg - how about unrolling the
loop in kvm unit test?


diff --git a/x86/vmexit.c b/x86/vmexit.c
index 957d0cc..405d545 100644
--- a/x86/vmexit.c
+++ b/x86/vmexit.c
@@ -40,6 +40,15 @@ static unsigned int inl(unsigned short port)
 {
 unsigned int val;
 asm volatile("inl %w1, %0" : "=a"(val) : "Nd"(port));
+asm volatile("inl %w1, %0" : "=a"(val) : "Nd"(port));
+asm volatile("inl %w1, %0" : "=a"(val) : "Nd"(port));
+asm volatile("inl %w1, %0" : "=a"(val) : "Nd"(port));
+asm volatile("inl %w1, %0" : "=a"(val) : "Nd"(port));
+asm volatile("inl %w1, %0" : "=a"(val) : "Nd"(port));
+asm volatile("inl %w1, %0" : "=a"(val) : "Nd"(port));
+asm volatile("inl %w1, %0" : "=a"(val) : "Nd"(port));
+asm volatile("inl %w1, %0" : "=a"(val) : "Nd"(port));
+asm volatile("inl %w1, %0" : "=a"(val) : "Nd"(port));
 return val;
 }
 

Then you have to divide the reported result by 10.

> phys_page_find(x86_64)18% 25%| 0.8% 1%
> inl_from_qemu cycles(x86_64)  163 173|   ~12000  ~12000
> 
> Thus this patch costs 0.4% in the worst case for KVM, 12% in the worst case
> for TCG.  The cycle breakdown is:
> 
> 60 phys_page_find
> 28 access_with_adjusted_size
> 24 address_space_translate_internal
> 20 address_space_rw
> 13 io_mem_read
> 11 address_space_translate
>  9 memory_region_read_accessor
>  6 memory_region_access_valid
>  4 helper_inl
>  4 memory_access_size
>  3 cpu_inl
> 
> (This run reported 177 cycles per access; the total is 182 due to rounding).
> It is probably possible to shave at least 10 cycles from the functions below,
> or to make the depth of the tree dynamic so that you would save even more
> compared to 1.6.0.
> 
> Also, compiling with "-fstack-protector" instead of "-fstack-protector-all",
> as suggested a while ago by rth, is already giving a savings of 20 cycles.
> 

Is it true that with TCG this affects more than just MMIO
as phys_page_find will also sometimes run on CPU accesses to memory?

> And of course, if this were a realistic test, KVM's 60x penalty would
> be a severe problem---but it isn't, because this is not a realistic setting.
> 
> Paolo

Well, for this argument to carry the day we'd need to design
a realistic test which isn't easy :)

-- 
MST



Re: [Qemu-devel] How to introduce bs->node_name ?

2013-11-07 Thread Benoît Canet
Le Monday 04 Nov 2013 à 19:33:21 (+0800), Fam Zheng a écrit :
> 
> On 11/04/2013 07:06 PM, Kevin Wolf wrote:
> >Am 04.11.2013 um 10:48 hat Fam Zheng geschrieben:
> >>On 11/04/2013 05:31 PM, Stefan Hajnoczi wrote:
> >>>On Wed, Oct 30, 2013 at 02:49:32PM +0100, Markus Armbruster wrote:
> The first proposal is to add another parameter, say "id".  Users can
> then refer either to an arbitrary BDS by "id", or (for backward
> compatibility) to the root BDS by "device".  When the code sees
> "device", it'll look up the BB, then fetch its root BDS.
> 
> CON: Existing parameter "device" becomes compatibility cruft.
> 
> PRO: Clean and obvious semantics (in my opinion).
> >>>This proposal gets my vote.
> >>>
> The second proposal is to press the existing parameter "device" into
> service for referring to BDS node_name.
> 
> To keep backward compatibility, we obviously need to ensure that
> whenever the old code accepts a value of "device", the new code accepts
> it as well, and both resolve it to the same BDS.
> >>>Different legacy commands given the same device name might need to
> >>>operate on different nodes.
> >>Could you give an example for this?
> >>
> >>
> >>>Dynamic renaming does not solve this
> >>>problem, so I'm not convinced we can always choose a device name
> >>>matching a node name.
> >>>
> >>>Device name commands are higher-level than graph node commands.  For
> >>>example, block_set_io_throttle makes sense on a device but less sense on
> >>>a graph node, unless we add the implicit assumption that the new
> >>>throttling node is created on top of the given node or updated in place
> >>>if the throttling node already exists (!!).
> >>Throttling a node could be useful too, for example if we want to
> >>throttle backing_hd which is on shared storage, but not to throttle
> >>on the local image.
> >>
> >>My ignorant question is: Why can't we just use one namespace, make
> >>sure no name collision between node_name and device_name, or even
> >>just drop device_name, so we treat the root node's node_name as
> >>device_name? For commands that only accept a device, this can be
> >>enforced in its implementation by checking against the whole graph
> >>to verify this.
> >Markus described it somewhere in this thread: Live snapshots.
> >Currently, the device_name moves to the new BDS on the top (and
> >compatibility requires us to keep it that way), whereas a node name
> >should, of course, stay at its node.
> >
> >When you consider this, the single namespace, as much as I would have
> >loved it, is pretty much dead.

Good everyone agree on the direction to take.
I'll write some code.

Best regards

Benoît

> 
> Thanks for explaining (again). I get the reason now.
> 
> Fam
> 



[Qemu-devel] [PATCH] qapi-schema: Update description for NewImageMode

2013-11-07 Thread Max Reitz
If the NewImageMode is "absolute-paths" but no backing file is available
(e.g., when mirroring a device with an unbacked image), the target image
will not be backed either. This patch updates the documentation in
qapi-schema.json accordingly.

Signed-off-by: Max Reitz 
---
Follow-up to:
 - block/drive-mirror: Check for NULL backing_hd
---
 qapi-schema.json | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/qapi-schema.json b/qapi-schema.json
index 81a375b..dde8e45 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1736,7 +1736,8 @@
 # @existing: QEMU should look for an existing image file.
 #
 # @absolute-paths: QEMU should create a new image with absolute paths
-# for the backing file.
+# for the backing file. If there is no backing file available, the new
+# image will not be backed either.
 #
 # Since: 1.1
 ##
-- 
1.8.4.2




[Qemu-devel] [PATCH v1] e1000: initial link negotiation on mac osx

2013-11-07 Thread Gabriel L. Somlo
Some guest operating systems' drivers (particularly Mac OS X)
expect the link state to be pre-initialized by an earlier
component such as a proprietary BIOS. This patch injects an
additional LSC event upon PHY reset, allowing the OS X driver
to successfully complete initial link negotiation. This is a
follow-up to commit 372254c6e5c078fb13b236bb648d2b9b2b0c70f1,
which works around the OS X driver's failure to properly set
up the MAC address.

Signed-off-by: Gabriel Somlo 
---

I studied the Intel 8254xxx manual and the various earlier
suggestions, and came up with the following, which both works
(at least for me, using SnowLeopard) and also appears to make
sense from a "code flow" point of view. Please let me know
what you all think.

Thanks,
  Gabriel

 hw/net/e1000.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/hw/net/e1000.c b/hw/net/e1000.c
index ec8ecd7..2f2fc3a 100644
--- a/hw/net/e1000.c
+++ b/hw/net/e1000.c
@@ -186,6 +186,9 @@ e1000_link_up(E1000State *s)
 s->phy_reg[PHY_STATUS] |= MII_SR_LINK_STATUS;
 }
 
+/* Forward decl. for use in set_phy_ctrl() (OS X link nego. workaround) */
+static void set_ics(E1000State *s, int index, uint32_t val);
+
 static void
 set_phy_ctrl(E1000State *s, int index, uint16_t val)
 {
@@ -197,6 +200,15 @@ set_phy_ctrl(E1000State *s, int index, uint16_t val)
 if (!(s->compat_flags & E1000_FLAG_AUTONEG)) {
 return;
 }
+/*
+ * The Mac OS X driver expects a pre-initialized network card; injecting
+ * an extra LSC event here allows initial link negotiation to succeed in
+ * the absence of the Apple EFI BIOS.
+ */
+if ((val & MII_CR_RESET)) {
+set_ics(s, 0, E1000_ICR_LSC);
+return;
+}
 if ((val & MII_CR_AUTO_NEG_EN) && (val & MII_CR_RESTART_AUTO_NEG)) {
 e1000_link_down(s);
 s->phy_reg[PHY_STATUS] &= ~MII_SR_AUTONEG_COMPLETE;
-- 
1.8.1.4




Re: [Qemu-devel] [PATCH 0/2] exec: alternative fix for master abort woes

2013-11-07 Thread Paolo Bonzini
Il 07/11/2013 17:47, Michael S. Tsirkin ha scritto:
> That's on kvm with 52 bit address.
> But where I would be concerned is systems with e.g. 36 bit address
> space where we are doubling the cost of the lookup.
> E.g. try i386 and not x86_64.

Tried now...

P_L2_LEVELS pre-patch   post-patch
   i386 3   6
   x86_64   4   6

I timed the inl_from_qemu test of vmexit.flat with both KVM and TCG.  With
TCG there's indeed a visible penalty of 20 cycles for i386 and 10 for x86_64
(you can extrapolate to 30 cycles for TARGET_PHYS_ADDR_SPACE_BITS=32 targets).
These can be more or less entirely ascribed to phys_page_find:

 TCG |  KVM
   pre-patch  post-patch |  pre-patch   post-patch
phys_page_find(i386)  13% 25%| 0.6% 1%
inl_from_qemu cycles(i386)153 173|   ~12000  ~12000
phys_page_find(x86_64)18% 25%| 0.8% 1%
inl_from_qemu cycles(x86_64)  163 173|   ~12000  ~12000

Thus this patch costs 0.4% in the worst case for KVM, 12% in the worst case
for TCG.  The cycle breakdown is:

60 phys_page_find
28 access_with_adjusted_size
24 address_space_translate_internal
20 address_space_rw
13 io_mem_read
11 address_space_translate
 9 memory_region_read_accessor
 6 memory_region_access_valid
 4 helper_inl
 4 memory_access_size
 3 cpu_inl

(This run reported 177 cycles per access; the total is 182 due to rounding).
It is probably possible to shave at least 10 cycles from the functions below,
or to make the depth of the tree dynamic so that you would save even more
compared to 1.6.0.

Also, compiling with "-fstack-protector" instead of "-fstack-protector-all",
as suggested a while ago by rth, is already giving a savings of 20 cycles.

And of course, if this were a realistic test, KVM's 60x penalty would
be a severe problem---but it isn't, because this is not a realistic setting.

Paolo



Re: [Qemu-devel] [PATCH 0/2] exec: alternative fix for master abort woes

2013-11-07 Thread Michael S. Tsirkin
On Thu, Nov 07, 2013 at 05:29:15PM +0100, Paolo Bonzini wrote:
> Il 07/11/2013 17:21, Michael S. Tsirkin ha scritto:
> >> > This fixes the problems with the misalignment of the master abort region.
> >> > See patch 2 for details, patch 1 is just a preparatory search-and-replace
> >> > patch.
> >> > 
> >> > Paolo Bonzini (2):
> >> >   split definitions for exec.c and translate-all.c radix trees
> >> >   exec: make address spaces 64-bit wide
> > Can you please share info on testing you did?
> > 
> 
> "make check", booting a RHEL guest with both KVM and TCG, Luiz's gdb
> crash.  I also ran vmexit.flat from kvm-unit-tests and checked that
> there was no measurable slowdown.
> 
> Paolo

That's on kvm with 52 bit address.
But where I would be concerned is systems with e.g. 36 bit address
space where we are doubling the cost of the lookup.
E.g. try i386 and not x86_64.

-- 
But



Re: [Qemu-devel] [PATCH 0/2] exec: alternative fix for master abort woes

2013-11-07 Thread Paolo Bonzini
Il 07/11/2013 17:21, Michael S. Tsirkin ha scritto:
>> > This fixes the problems with the misalignment of the master abort region.
>> > See patch 2 for details, patch 1 is just a preparatory search-and-replace
>> > patch.
>> > 
>> > Paolo Bonzini (2):
>> >   split definitions for exec.c and translate-all.c radix trees
>> >   exec: make address spaces 64-bit wide
> Can you please share info on testing you did?
> 

"make check", booting a RHEL guest with both KVM and TCG, Luiz's gdb
crash.  I also ran vmexit.flat from kvm-unit-tests and checked that
there was no measurable slowdown.

Paolo



Re: [Qemu-devel] [PATCH 0/2] exec: alternative fix for master abort woes

2013-11-07 Thread Michael S. Tsirkin
On Thu, Nov 07, 2013 at 05:14:35PM +0100, Paolo Bonzini wrote:
> This fixes the problems with the misalignment of the master abort region.
> See patch 2 for details, patch 1 is just a preparatory search-and-replace
> patch.
> 
> Paolo Bonzini (2):
>   split definitions for exec.c and translate-all.c radix trees
>   exec: make address spaces 64-bit wide

Can you please share info on testing you did?

>  exec.c  | 28 
>  translate-all.c | 32 ++--
>  translate-all.h |  7 ---
>  3 files changed, 34 insertions(+), 33 deletions(-)
> 
> -- 
> 1.8.4.2



[Qemu-devel] [PATCH 2/2] exec: make address spaces 64-bit wide

2013-11-07 Thread Paolo Bonzini
As an alternative to commit 818f86b (exec: limit system memory
size, 2013-11-04) let's just make all address spaces 64-bit wide.
This eliminates problems with phys_page_find ignoring bits above
TARGET_PHYS_ADDR_SPACE_BITS and address_space_translate_internal
consequently messing up the computations.

In Luiz's reported crash, at startup gdb attempts to read from address
0xffe6 to 0x inclusive.  The region it gets
is the newly introduced master abort region, which is as big as the PCI
address space (see pci_bus_init).  Due to a typo that's only 2^63-1,
not 2^64.  But we get it anyway because phys_page_find ignores the upper
bits of the physical address.  In address_space_translate_internal then

diff = int128_sub(section->mr->size, int128_make64(addr));
*plen = int128_get64(int128_min(diff, int128_make64(*plen)));

diff becomes negative, and int128_get64 booms.

The size of the PCI address space region should be fixed anyway.

Reported-by: Luiz Capitulino 
Signed-off-by: Paolo Bonzini 
---
 exec.c | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/exec.c b/exec.c
index 9e2fc4b..d5ce3da 100644
--- a/exec.c
+++ b/exec.c
@@ -89,7 +89,7 @@ struct PhysPageEntry {
 };
 
 /* Size of the L2 (and L3, etc) page tables.  */
-#define ADDR_SPACE_BITS TARGET_PHYS_ADDR_SPACE_BITS
+#define ADDR_SPACE_BITS 64
 
 #define P_L2_BITS 10
 #define P_L2_SIZE (1 << P_L2_BITS)
@@ -1750,11 +1750,7 @@ static void memory_map_init(void)
 {
 system_memory = g_malloc(sizeof(*system_memory));
 
-assert(ADDR_SPACE_BITS <= 64);
-
-memory_region_init(system_memory, NULL, "system",
-   ADDR_SPACE_BITS == 64 ?
-   UINT64_MAX : (0x1ULL << ADDR_SPACE_BITS));
+memory_region_init(system_memory, NULL, "system", UINT64_MAX);
 address_space_init(&address_space_memory, system_memory, "memory");
 
 system_io = g_malloc(sizeof(*system_io));
-- 
1.8.4.2




[Qemu-devel] [PATCH 1/2] split definitions for exec.c and translate-all.c radix trees

2013-11-07 Thread Paolo Bonzini
The exec.c and translate-all.c radix trees are quite different, and
the exec.c one in particular is not limited to the CPU---it can be
used also by devices that do DMA, and in that case the address space
is not limited to TARGET_PHYS_ADDR_SPACE_BITS bits.

We want to make exec.c's radix trees 64-bit wide.  As a first step,
stop sharing the constants between exec.c and translate-all.c.
exec.c gets P_L2_* constants, translate-all.c gets V_L2_*, for
consistency with the existing V_L1_* symbols.  Though actually
in the softmmu case translate-all.c is also indexed by physical
addresses...

This patch has no semantic change.

Signed-off-by: Paolo Bonzini 
---
 exec.c  | 28 ++--
 translate-all.c | 32 ++--
 translate-all.h |  7 ---
 3 files changed, 36 insertions(+), 31 deletions(-)

diff --git a/exec.c b/exec.c
index 79610ce..9e2fc4b 100644
--- a/exec.c
+++ b/exec.c
@@ -88,7 +88,15 @@ struct PhysPageEntry {
 uint16_t ptr : 15;
 };
 
-typedef PhysPageEntry Node[L2_SIZE];
+/* Size of the L2 (and L3, etc) page tables.  */
+#define ADDR_SPACE_BITS TARGET_PHYS_ADDR_SPACE_BITS
+
+#define P_L2_BITS 10
+#define P_L2_SIZE (1 << P_L2_BITS)
+
+#define P_L2_LEVELS (((ADDR_SPACE_BITS - TARGET_PAGE_BITS - 1) / P_L2_BITS) + 
1)
+
+typedef PhysPageEntry Node[P_L2_SIZE];
 
 struct AddressSpaceDispatch {
 /* This is a multi-level map on the physical address space.
@@ -155,7 +163,7 @@ static uint16_t phys_map_node_alloc(void)
 ret = next_map.nodes_nb++;
 assert(ret != PHYS_MAP_NODE_NIL);
 assert(ret != next_map.nodes_nb_alloc);
-for (i = 0; i < L2_SIZE; ++i) {
+for (i = 0; i < P_L2_SIZE; ++i) {
 next_map.nodes[ret][i].is_leaf = 0;
 next_map.nodes[ret][i].ptr = PHYS_MAP_NODE_NIL;
 }
@@ -168,13 +176,13 @@ static void phys_page_set_level(PhysPageEntry *lp, hwaddr 
*index,
 {
 PhysPageEntry *p;
 int i;
-hwaddr step = (hwaddr)1 << (level * L2_BITS);
+hwaddr step = (hwaddr)1 << (level * P_L2_BITS);
 
 if (!lp->is_leaf && lp->ptr == PHYS_MAP_NODE_NIL) {
 lp->ptr = phys_map_node_alloc();
 p = next_map.nodes[lp->ptr];
 if (level == 0) {
-for (i = 0; i < L2_SIZE; i++) {
+for (i = 0; i < P_L2_SIZE; i++) {
 p[i].is_leaf = 1;
 p[i].ptr = PHYS_SECTION_UNASSIGNED;
 }
@@ -182,9 +190,9 @@ static void phys_page_set_level(PhysPageEntry *lp, hwaddr 
*index,
 } else {
 p = next_map.nodes[lp->ptr];
 }
-lp = &p[(*index >> (level * L2_BITS)) & (L2_SIZE - 1)];
+lp = &p[(*index >> (level * P_L2_BITS)) & (P_L2_SIZE - 1)];
 
-while (*nb && lp < &p[L2_SIZE]) {
+while (*nb && lp < &p[P_L2_SIZE]) {
 if ((*index & (step - 1)) == 0 && *nb >= step) {
 lp->is_leaf = true;
 lp->ptr = leaf;
@@ -218,7 +226,7 @@ static MemoryRegionSection *phys_page_find(PhysPageEntry 
lp, hwaddr index,
 return §ions[PHYS_SECTION_UNASSIGNED];
 }
 p = nodes[lp.ptr];
-lp = p[(index >> (i * L2_BITS)) & (L2_SIZE - 1)];
+lp = p[(index >> (i * P_L2_BITS)) & (P_L2_SIZE - 1)];
 }
 return §ions[lp.ptr];
 }
@@ -1742,11 +1750,11 @@ static void memory_map_init(void)
 {
 system_memory = g_malloc(sizeof(*system_memory));
 
-assert(TARGET_PHYS_ADDR_SPACE_BITS <= 64);
+assert(ADDR_SPACE_BITS <= 64);
 
 memory_region_init(system_memory, NULL, "system",
-   TARGET_PHYS_ADDR_SPACE_BITS == 64 ?
-   UINT64_MAX : (0x1ULL << TARGET_PHYS_ADDR_SPACE_BITS));
+   ADDR_SPACE_BITS == 64 ?
+   UINT64_MAX : (0x1ULL << ADDR_SPACE_BITS));
 address_space_init(&address_space_memory, system_memory, "memory");
 
 system_io = g_malloc(sizeof(*system_io));
diff --git a/translate-all.c b/translate-all.c
index aeda54d..1c63d78 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -96,12 +96,16 @@ typedef struct PageDesc {
 # define L1_MAP_ADDR_SPACE_BITS  TARGET_VIRT_ADDR_SPACE_BITS
 #endif
 
+/* Size of the L2 (and L3, etc) page tables.  */
+#define V_L2_BITS 10
+#define V_L2_SIZE (1 << V_L2_BITS)
+
 /* The bits remaining after N lower levels of page tables.  */
 #define V_L1_BITS_REM \
-((L1_MAP_ADDR_SPACE_BITS - TARGET_PAGE_BITS) % L2_BITS)
+((L1_MAP_ADDR_SPACE_BITS - TARGET_PAGE_BITS) % V_L2_BITS)
 
 #if V_L1_BITS_REM < 4
-#define V_L1_BITS  (V_L1_BITS_REM + L2_BITS)
+#define V_L1_BITS  (V_L1_BITS_REM + V_L2_BITS)
 #else
 #define V_L1_BITS  V_L1_BITS_REM
 #endif
@@ -395,18 +399,18 @@ static PageDesc *page_find_alloc(tb_page_addr_t index, 
int alloc)
 lp = l1_map + ((index >> V_L1_SHIFT) & (V_L1_SIZE - 1));
 
 /* Level 2..N-1.  */
-for (i = V_L1_SHIFT / L2_BITS - 1; i > 0; i--) {
+for (i = V_L1_SHIFT / V_L2_BITS - 1; i > 0; i--) {
 void **p = *lp;
 
 if (p == NULL) {
 if (!alloc) {
 return NULL;
 }
-  

[Qemu-devel] [PATCH 0/2] exec: alternative fix for master abort woes

2013-11-07 Thread Paolo Bonzini
This fixes the problems with the misalignment of the master abort region.
See patch 2 for details, patch 1 is just a preparatory search-and-replace
patch.

Paolo Bonzini (2):
  split definitions for exec.c and translate-all.c radix trees
  exec: make address spaces 64-bit wide

 exec.c  | 28 
 translate-all.c | 32 ++--
 translate-all.h |  7 ---
 3 files changed, 34 insertions(+), 33 deletions(-)

-- 
1.8.4.2




Re: [Qemu-devel] Does QEMU support MIPS SMP2 malta board?

2013-11-07 Thread Edgar E. Iglesias
On Thu, Oct 31, 2013 at 11:47:12AM +0800, Nancy wrote:
> Hi,
> 
> Does QEMU support MIPS smp 2 malta board?
> 
> $ qemu-system-mipsel -M malta -kernel vmlinux -initrd  ramfs.cpio.gz \
> -append "console=ttyS0 root=/dev/ram0 maxcpus=2"  -nographic -s \
> -smp 2
>



>
> 
> There's no problem while I removing "-smp 2" option. (plus the kernel
> MIPS configure file " malta_defconfig" support smp 2 )


Hi,

Yes it should work.
Can you please try again with -cpu 34Kf?

Cheers,
Edgar



Re: [Qemu-devel] Multi-head support RFC

2013-11-07 Thread John Baboval

On 11/7/2013 8:46 AM, Gerd Hoffmann wrote:

   Hi,


As far as the EDID is concerned, there can only be one EDID for a
display+hw pair, or the guest won't know what to do. In my use-case, I
simply pass real EDIDs through, and create a full-screen window for each
real monitor.

Ok, makes sense.


If you wanted to have two UIs displaying the same
DisplaySurface, the EDID would have to come from one of them, and the
other would have to clip, or scale.

Yes.


Why not?  That is exactly my plan.  Just have the virtual graphic card
call graphic_console_init() multiple times, once for each display
connector it has.

Do you see fundamental issues with that approach?

Currently only one QemuConsole is active at a time, so that would have
to change

That isn't mandatory any more.  It is still the default behavior of a
DisplayChangeListener to follow the active_console, for compatibility
reasons.  SDL and VNC still behave that way.

You can explicitly bind a DisplayChangeListener to a QemuConsole though,
by setting DisplayChangeListener->con before calling
register_displaychangelistener().

gtk binds to QemuConsole #0.

spice creates a display channel per (graphical) console.  Each display
channel has a DisplayChangeListener instance, and each
DisplayChangeListener is linked to a different QemuConsole.

For your UI you probably want follow the spice model.  Have a
DisplayChangeListener for each physical monitor of the host, have a
fixed QemuConsole bound to each DisplayChangeListener.
DisplayChangeListeners can come and go at runtime just fine, so you
should be able to create/destroy them on monitor plug/unplug events on
the host.


I think the best thing for me to do at this point is to just start 
implementing with multiple QemuConsole, then, and post again when I have 
a better idea of what it ends up looking like. I'll keep you posted.




cheers,
   Gerd







[Qemu-devel] [Bug 1248427] Re: whether qemu-img convert will support "-s" option in the newer version

2013-11-07 Thread Serge Hallyn
** Changed in: qemu
   Status: Incomplete => Invalid

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1248427

Title:
  whether qemu-img convert will support "-s" option in the newer version

Status in QEMU:
  Invalid

Bug description:
  In ubuntu os, my qemu-img version is 0.12.0 and its help message show that 
"qemu-img convert" command supports "-s" option, but it disappeared in qemu-img 
0.12.1. However, though it disappeared, it does support in RHEL6.4, and the rpm 
version is qemu-img-0.12.1.2-2.355.el6.x86_64. But in RHEL6.5Beta, the same 
version 0.12.1, if use "qemu-img convert -s", it reports a not support '-s' 
option error message. The rpm version is qemu-img-0.12.1.2-2.402.el6.x86_64. 
  And because of this issue, the openstack "nova image-create" will raise an 
error. Because it uses this command with "-s" option.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1248427/+subscriptions



Re: [Qemu-devel] Questions about Spice pv domUs

2013-11-07 Thread Fabio Fantoni

Il 07/11/2013 16:25, Stefano Stabellini ha scritto:

On Thu, 7 Nov 2013, Fabio Fantoni wrote:

Il 06/11/2013 18:16, Stefano Stabellini ha scritto:

On Tue, 5 Nov 2013, Fabio Fantoni wrote:

Il 30/09/2013 16:56, Fabio Fantoni ha scritto:

I'm trying to implement basic spice support on xen pv domUs.

Test seems ok on Ubuntu 12.04 pv domU except mouse which is not visible.
I also tried agent-mouse=off on qemu spice options but is not working or
maybe spicy (from spice-gtk 0.20) has problem in this case (option to
grab
mouse is already enabled).
I can't add vdagent for now on pv because it hasn't  pci support.
Are there xen parts which may give problem with mouse or couldn't be a
xen
related problem?

Given that PCI and USB buses are both missing in PV guests, I guess that
the issue might be that spice assumes that the mouse is somehow emulated
by a USB device? I think it could be difficult to disentangle spice
support from usb/pci.
You could try to run only the mouse part of the xenfb protocol to get
mouse support.

BTW where are you running the spice backend? Is it a standalone daemon?


For now I did only fast test forcing qemu parameters for pv dom's on libxl
code with the same spice paramters used for hvm domUs.
On first test was not working, then I added xenfb vga, and the screen works
but mouse not.

Is it possible that you are actually not using spice at all, but just
using straight xenfb?



Yes, is spice on qemu, simply with xenfb as vga instead of stdvga that I 
use with hvm domUs for now.
About qxl vga (optimal for increase spice performance and more 
resolutions support), seems that requires other modifications/fix xen 
side that are not able to do :(

Latest debug that I tried if can be useful:
http://lists.xen.org/archives/html/xen-devel/2013-10/msg00016.html
Waiting for news about it I'm trying to implement at least the basic 
spice support on pv for use spice for both hvm and pv (atleast the part 
of domUs maintenance).




Qemu parameters on my test was:

libxl: debug: libxl_dm.c:1282:libxl__spawn_local_dm: Spawning
device-model
/usr/lib/xen/bin/qemu-system-i386 with arguments:
libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm:
/usr/lib/xen/bin/qemu-system-i386
libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm: -xen-domid
libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm:   19
libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm:   -chardev
libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm:
socket,id=libxl-cmd,path=/var/run/xen/qmp-libxl-19,server,nowait
libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm:   -mon
libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm:
chardev=libxl-cmd,mode=control
libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm: -nodefaults
libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm: -xen-attach
libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm:   -name
libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm:   PRECISE
libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm:   -k
libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm:   it
libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm:   -spice
libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm:
port=6002,tls-port=0,addr=0.0.0.0,disable-ticketing,agent-mouse=off
libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm:   -vga
libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm:   xenfb
libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm:   -M
libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm:   xenpv
libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm:   -m
libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm:   1025

I have also another question for qemu developers: I tried to change qemu
-vga parameter to -device but isn't working and I not found nothing on
docs
or man. Is xenfb available with new qemu parameter -device?

As I replied in the other email, xenfb is configured and initialized via
xenstore. Why do you want a command line parameter for it?

I tried to do fast greps without find active code about xenfb.
Can you tell me what I must search for find the new part about xenfb please?

The xenfb code is here:

hw/display/xenfb.c

It is registered here:

hw/i386/xen_machine_pv.c:xen_init_pv


Thanks, then I must search on qemu code what xenstore parameters enable 
xenfb and after search on xen, right?




Re: [Qemu-devel] [PATCH for-1.7 v2 1/8] exec: declare TARGET_PHYS_ADDR_SPACE_MAX to limit memory regions rendered by exec

2013-11-07 Thread Michael S. Tsirkin
On Thu, Nov 07, 2013 at 02:49:06PM +0100, Paolo Bonzini wrote:
> Il 07/11/2013 13:04, Michael S. Tsirkin ha scritto:
> >> > (it follows from this that using the #define anywhere
> >> > except in a memory_region_init() call is probably a bug)
> >> > 
> >> > -- PMM
> > 
> > BTW how about we change the API to pass in int128?
> 
> The vast majority of memory regions have small sizes, it would add
> boilerplate to wrap the size with an int128_make64 everywhere.
> 
> Paolo

Well let's have two APIs for 64 bit and for 128 bit.

> > Not for 1.7 of course.
> > 
> > This will help make sure it's only used for MRs.



[Qemu-devel] [PATCH 1/2] block: Add bs->node_name to hold the name of a bs node of the bs graph.

2013-11-07 Thread Benoît Canet
Add the minimum of code to prepare the followings patches.

If no node_name is provided to bdrv_new the bs->node_name is set to "undefined".
This will allow to have some default string to communicate in QMP and HMP.
This also make "undefined" a reserved string for bs->node_name.

Signed-off-by: Benoit Canet 
---
 block.c   | 70 +++
 block/blkverify.c |  2 +-
 block/iscsi.c |  2 +-
 block/vmdk.c  |  2 +-
 block/vvfat.c |  4 +--
 blockdev.c|  8 +++---
 hw/block/xen_disk.c   |  2 +-
 include/block/block.h |  3 +-
 include/block/block_int.h |  9 +-
 qemu-img.c|  6 ++--
 qemu-io.c |  2 +-
 qemu-nbd.c|  2 +-
 12 files changed, 77 insertions(+), 35 deletions(-)

diff --git a/block.c b/block.c
index fd05a80..230e71a 100644
--- a/block.c
+++ b/block.c
@@ -89,6 +89,9 @@ static int coroutine_fn 
bdrv_co_do_write_zeroes(BlockDriverState *bs,
 static QTAILQ_HEAD(, BlockDriverState) bdrv_states =
 QTAILQ_HEAD_INITIALIZER(bdrv_states);
 
+static QTAILQ_HEAD(, BlockDriverState) graph_bdrv_states =
+QTAILQ_HEAD_INITIALIZER(graph_bdrv_states);
+
 static QLIST_HEAD(, BlockDriver) bdrv_drivers =
 QLIST_HEAD_INITIALIZER(bdrv_drivers);
 
@@ -318,14 +321,26 @@ void bdrv_register(BlockDriver *bdrv)
 }
 
 /* create a new block device (by default it is empty) */
-BlockDriverState *bdrv_new(const char *device_name)
+BlockDriverState *bdrv_new(const char *device_name, const char *node_name)
 {
 BlockDriverState *bs;
 
 bs = g_malloc0(sizeof(BlockDriverState));
 pstrcpy(bs->device_name, sizeof(bs->device_name), device_name);
 if (device_name[0] != '\0') {
-QTAILQ_INSERT_TAIL(&bdrv_states, bs, list);
+QTAILQ_INSERT_TAIL(&bdrv_states, bs, device_list);
+}
+/* if node name is given store it in bs and insert bs in the graph bs list
+ * note: undefined is a reserved node name
+ */
+if (node_name &&
+node_name[0] != '\0' &&
+strcmp(node_name, "undefined")) {
+pstrcpy(bs->node_name, sizeof(bs->node_name), node_name);
+QTAILQ_INSERT_TAIL(&graph_bdrv_states, bs, node_list);
+/* else set the bs node name to undefined for QMP and HMP */
+} else {
+sprintf(bs->node_name, "undefined");
 }
 bdrv_iostatus_disable(bs);
 notifier_list_init(&bs->close_notifiers);
@@ -870,7 +885,7 @@ int bdrv_file_open(BlockDriverState **pbs, const char 
*filename,
 options = qdict_new();
 }
 
-bs = bdrv_new("");
+bs = bdrv_new("", NULL);
 bs->options = options;
 options = qdict_clone_shallow(options);
 
@@ -992,7 +1007,7 @@ int bdrv_open_backing_file(BlockDriverState *bs, QDict 
*options, Error **errp)
sizeof(backing_filename));
 }
 
-bs->backing_hd = bdrv_new("");
+bs->backing_hd = bdrv_new("", NULL);
 
 if (bs->backing_format[0] != '\0') {
 back_drv = bdrv_find_format(bs->backing_format);
@@ -1062,7 +1077,7 @@ int bdrv_open(BlockDriverState *bs, const char *filename, 
QDict *options,
instead of opening 'filename' directly */
 
 /* if there is a backing file, use it */
-bs1 = bdrv_new("");
+bs1 = bdrv_new("", NULL);
 ret = bdrv_open(bs1, filename, NULL, 0, drv, &local_err);
 if (ret < 0) {
 bdrv_unref(bs1);
@@ -1495,7 +1510,7 @@ void bdrv_close_all(void)
 {
 BlockDriverState *bs;
 
-QTAILQ_FOREACH(bs, &bdrv_states, list) {
+QTAILQ_FOREACH(bs, &bdrv_states, device_list) {
 bdrv_close(bs);
 }
 }
@@ -1524,7 +1539,7 @@ static bool bdrv_requests_pending(BlockDriverState *bs)
 static bool bdrv_requests_pending_all(void)
 {
 BlockDriverState *bs;
-QTAILQ_FOREACH(bs, &bdrv_states, list) {
+QTAILQ_FOREACH(bs, &bdrv_states, device_list) {
 if (bdrv_requests_pending(bs)) {
 return true;
 }
@@ -1554,7 +1569,7 @@ void bdrv_drain_all(void)
 /* FIXME: We do not have timer support here, so this is effectively
  * a busy wait.
  */
-QTAILQ_FOREACH(bs, &bdrv_states, list) {
+QTAILQ_FOREACH(bs, &bdrv_states, device_list) {
 if (bdrv_start_throttled_reqs(bs)) {
 busy = true;
 }
@@ -1570,7 +1585,7 @@ void bdrv_drain_all(void)
 void bdrv_make_anon(BlockDriverState *bs)
 {
 if (bs->device_name[0] != '\0') {
-QTAILQ_REMOVE(&bdrv_states, bs, list);
+QTAILQ_REMOVE(&bdrv_states, bs, device_list);
 }
 bs->device_name[0] = '\0';
 }
@@ -1626,7 +1641,12 @@ static void bdrv_move_feature_fields(BlockDriverState 
*bs_dest,
 /* keep the same entry in bdrv_states */
 pstrcpy(bs_dest->device_name, sizeof(bs_dest->device_name),
 bs_src->device_name);
-bs_dest->list = bs_src->list;
+bs_dest->device_list = bs_src->device_list;
+
+/* keep the same entry in 

Re: [Qemu-devel] [RFC PATCH] pc: align gpa<->hpa on 1GB boundary by splitting RAM on several regions

2013-11-07 Thread Igor Mammedov
On Wed, 30 Oct 2013 21:44:12 -0200
Marcelo Tosatti  wrote:

> On Wed, Oct 30, 2013 at 08:56:17PM +0100, Igor Mammedov wrote:
> > On Wed, 30 Oct 2013 16:51:29 -0200
> > Marcelo Tosatti  wrote:
> > 
[...]
> 
> > btw: now if QEMU can't allocate hugepages for whole guest size it will 
> > fallback
> > to 4k pages anyway for whole guest size, with warning that isn't visible if
> > user doesn't start QEMU manually.
> 
> Not with -mem-prealloc.
tested with it, it does not much, just prints error message to stderr
and then fallbacks to default allocator continuing guest booting.

> 
> 




Re: [Qemu-devel] Questions about Spice pv domUs

2013-11-07 Thread Stefano Stabellini
On Thu, 7 Nov 2013, Fabio Fantoni wrote:
> Il 06/11/2013 18:16, Stefano Stabellini ha scritto:
> > On Tue, 5 Nov 2013, Fabio Fantoni wrote:
> > > Il 30/09/2013 16:56, Fabio Fantoni ha scritto:
> > > > I'm trying to implement basic spice support on xen pv domUs.
> > > > 
> > > > Test seems ok on Ubuntu 12.04 pv domU except mouse which is not visible.
> > > > I also tried agent-mouse=off on qemu spice options but is not working or
> > > > maybe spicy (from spice-gtk 0.20) has problem in this case (option to
> > > > grab
> > > > mouse is already enabled).
> > > > I can't add vdagent for now on pv because it hasn't  pci support.
> > > > Are there xen parts which may give problem with mouse or couldn't be a
> > > > xen
> > > > related problem?
> > Given that PCI and USB buses are both missing in PV guests, I guess that
> > the issue might be that spice assumes that the mouse is somehow emulated
> > by a USB device? I think it could be difficult to disentangle spice
> > support from usb/pci.
> > You could try to run only the mouse part of the xenfb protocol to get
> > mouse support.
> > 
> > BTW where are you running the spice backend? Is it a standalone daemon?
> > 
> 
> For now I did only fast test forcing qemu parameters for pv dom's on libxl
> code with the same spice paramters used for hvm domUs.
> On first test was not working, then I added xenfb vga, and the screen works
> but mouse not.

Is it possible that you are actually not using spice at all, but just
using straight xenfb?


> > > > Qemu parameters on my test was:
> > > > > libxl: debug: libxl_dm.c:1282:libxl__spawn_local_dm: Spawning
> > > > > device-model
> > > > > /usr/lib/xen/bin/qemu-system-i386 with arguments:
> > > > > libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm:
> > > > > /usr/lib/xen/bin/qemu-system-i386
> > > > > libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm: -xen-domid
> > > > > libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm:   19
> > > > > libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm:   -chardev
> > > > > libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm:
> > > > > socket,id=libxl-cmd,path=/var/run/xen/qmp-libxl-19,server,nowait
> > > > > libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm:   -mon
> > > > > libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm:
> > > > > chardev=libxl-cmd,mode=control
> > > > > libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm: -nodefaults
> > > > > libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm: -xen-attach
> > > > > libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm:   -name
> > > > > libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm:   PRECISE
> > > > > libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm:   -k
> > > > > libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm:   it
> > > > > libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm:   -spice
> > > > > libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm:
> > > > > port=6002,tls-port=0,addr=0.0.0.0,disable-ticketing,agent-mouse=off
> > > > > libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm:   -vga
> > > > > libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm:   xenfb
> > > > > libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm:   -M
> > > > > libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm:   xenpv
> > > > > libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm:   -m
> > > > > libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm:   1025
> > > > I have also another question for qemu developers: I tried to change qemu
> > > > -vga parameter to -device but isn't working and I not found nothing on
> > > > docs
> > > > or man. Is xenfb available with new qemu parameter -device?
> > As I replied in the other email, xenfb is configured and initialized via
> > xenstore. Why do you want a command line parameter for it?
> 
> I tried to do fast greps without find active code about xenfb.
> Can you tell me what I must search for find the new part about xenfb please?

The xenfb code is here:

hw/display/xenfb.c

It is registered here:

hw/i386/xen_machine_pv.c:xen_init_pv



Re: [Qemu-devel] i386: pc: align gpa<->hpa on 1GB boundary (v4)

2013-11-07 Thread Igor Mammedov
On Wed, 6 Nov 2013 19:31:19 -0200
Marcelo Tosatti  wrote:

> 
> v2: condition enablement of new mapping to new machine types (Paolo)
> v3: fix changelog
> v4: rebase
> 
> -
> 
> 
> Align guest physical address and host physical address
> beyond guest 4GB on a 1GB boundary.
> 
> Otherwise 1GB TLBs cannot be cached for the range.
> 
> Signed-off-by: Marcelo Tosatti 
> 
> 
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index 12c436e..d29196d 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -1156,7 +1156,7 @@ FWCfgState *pc_memory_init(MemoryRegion *system_memory,
>  {
>  int linux_boot, i;
>  MemoryRegion *ram, *option_rom_mr;
> -MemoryRegion *ram_below_4g, *ram_above_4g;
> +MemoryRegion *ram_below_4g, *ram_above_4g, *ram_above_4g_piecetwo;
>  FWCfgState *fw_cfg;
>  
>  linux_boot = (kernel_filename != NULL);
> @@ -1177,10 +1177,45 @@ FWCfgState *pc_memory_init(MemoryRegion 
> *system_memory,
>  e820_add_entry(0, below_4g_mem_size, E820_RAM);
>  if (above_4g_mem_size > 0) {
>  ram_above_4g = g_malloc(sizeof(*ram_above_4g));
> -memory_region_init_alias(ram_above_4g, NULL, "ram-above-4g", ram,
> - below_4g_mem_size, above_4g_mem_size);
> -memory_region_add_subregion(system_memory, 0x1ULL,
> +/*
> + *
> + * If 1GB hugepages are used to back guest RAM, map guest address
> + * space in the range [ramsize,ramsize+holesize] to the ram block
> + * range [holestart, 4GB]
> + *
> + *  0  h 4G 
> [ramsize,ramsize+holesize]
> + *
> + * guest-addr-space [  ] [  ][xxx]
> + */--/
> + * contiguous-ram-block [  ][xxx][ ]
> + *
> + * So that memory beyond 4GB is aligned on a 1GB boundary,
> + * at the host physical address space.
> + *
> + */
I did some corner cases testing and there is alias overlapping in case 
  -m 4096 -mem-path /var/lib/hugetlbfs/global/pagesize-1GB

  0001-00011fff (prio 0, RW): alias ram-above-4g-piecetwo 
@pc.ram e000-
  0001-0001 (prio 0, RW): alias ram-above-4g @pc.ram 
0001-0001

perhaps zero sized ram-above-4g shouldn't be created at all?

in addition ram-above-4g-piecetwo starts at half page offset e000 
but guest sees it 4Gb offset,
wouldn't it cause the same issue patch tries to solve but only for 
ram-above-4g-piecetwo tail sync host/guest offsets
are not 1Gb aligned?

there is more misalignment with:
 -m 4097 -mem-path /var/lib/hugetlbfs/global/pagesize-1GB

  0001-0001000f (prio 0, RW): alias ram-above-4g @pc.ram 
0001-0001000f
  00010010-0001200f (prio 0, RW): alias ram-above-4g-piecetwo 
@pc.ram e000-

where ram-above-4g-piecetwo is aligned with 1Gb+1Mb GPA offset, in addition to 
500Mb offset of HPA.
which would cause the same issue as above prehaps?

> +if (guest_info->gb_align) {
> +unsigned long holesize = 0x1ULL - below_4g_mem_size;
> +
> +memory_region_init_alias(ram_above_4g, NULL, "ram-above-4g", ram,
> +0x1ULL,
> +above_4g_mem_size - holesize);
> +memory_region_add_subregion(system_memory, 0x1ULL,
>  ram_above_4g);
> +
> +ram_above_4g_piecetwo = g_malloc(sizeof(*ram_above_4g_piecetwo));
> +memory_region_init_alias(ram_above_4g_piecetwo, NULL,
> + "ram-above-4g-piecetwo", ram,
> + 0x1ULL - holesize, holesize);
> +memory_region_add_subregion(system_memory,
> +0x1ULL +
> +above_4g_mem_size - holesize,
> +ram_above_4g_piecetwo);
> +} else {
> +memory_region_init_alias(ram_above_4g, NULL, "ram-above-4g", ram,
> +below_4g_mem_size, above_4g_mem_size);
> +memory_region_add_subregion(system_memory, 0x1ULL,
> +ram_above_4g);
> +}
>  e820_add_entry(0x1ULL, above_4g_mem_size, E820_RAM);
>  }
>  
> diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
> index 4fdb7b6..686736e 100644
> --- a/hw/i386/pc_piix.c
> +++ b/hw/i386/pc_piix.c
> @@ -60,6 +60,7 @@ static const int ide_irq[MAX_IDE_BUS] = { 14, 15 };
>  static bool has_pvpanic;
>  static bool has_pci_info = true;
>  static bool has_acpi_build = true;
> +static bool gb_align = true;
>  
>  /* PC hardware initialisation */
>  static void pc_init1(QEMUMachineInitArgs *args,
> @@ -128,6 +

Re: [Qemu-devel] [PATCH for-1.7 v2 0/8] fix address space size issue

2013-11-07 Thread Michael S. Tsirkin
On Thu, Nov 07, 2013 at 04:07:34PM +0100, Paolo Bonzini wrote:
> Il 07/11/2013 16:05, Michael S. Tsirkin ha scritto:
> >> > So, ack for patch 5-7-8, which should also be enough to fix the problem
> >> > that Luiz reported.
> > Not at all. As long as exec.c ignores high bits, any access
> > there will end up in the wrong region.
> 
> ... unless it happens to be the correct region because the whole
> 2^64-bit range is covered by a single region.  This is what happens for
> IOMMUs.
> 
> Paolo

Only if you are lucky.
What makes sure ignoring high bits does not hit the e1000 MMIO?
Nothing I think.

> > We might not get a crash but we'll get guest memory corruption.



Re: [Qemu-devel] [PATCH] e1000/rtl8139: update HMP NIC when every bit is written

2013-11-07 Thread Michael S. Tsirkin
On Thu, Nov 07, 2013 at 07:33:57AM -0700, Alex Williamson wrote:
> On Thu, 2013-11-07 at 12:26 +0200, Michael S. Tsirkin wrote:
> > On Thu, Nov 07, 2013 at 03:32:29PM +0800, Amos Kong wrote:
> > > On Thu, Nov 07, 2013 at 08:59:22AM +0200, Michael S. Tsirkin wrote:
> > > > On Tue, Nov 05, 2013 at 07:17:18PM +0800, Amos Kong wrote:
> > > > > We currently just update the HMP NIC info when the last bit of macaddr
> > > > > is written. This assumes that guest driver will write all the macaddr
> > > > > from bit 0 to bit 5 when it changes the macaddr, this is the current
> > > > > behavior of linux driver (e1000/rtl8139cp), but we can't do this
> > > > > assumption.
> > > > > 
> > > > > The macaddr that is used for rx-filter will be updated when every bit
> > > > > is changed. This patch updates the e1000/rtl8139 nic to update HMP NIC
> > > > > info when every bit is changed. It will be same as virtio-net.
> > > > > 
> > > > > Signed-off-by: Amos Kong 
> > > > 
> > > > I'm not sure I buy this.
> > > > 
> > > > If we actually implement e.g. mac change notifications,
> > > > sending them on writes of random bytes will confuse
> > > > the host.
> > >
> > > This patch just effects the monitor display of macaddr.
> > > During each writing, the macaddr is used for rx-filter is really
> > > changed.
> > > 
> > > In the real hardware, it supports to just write part of bits,
> > > the rx-filtering is effected by every bit writing.
> > 
> > Yes but again, the window can just be too small to matter
> > on real hardware.
> > 
> > Our emulation is not perfect, fixing this to be just like real
> > hardware just might expose other bugs we can't fix
> > that easily.
> 
> If we were to implement mac change notification, then every partial
> update would send a notify and the host.  Is that a problem?  It seems
> no different than if the device had an atomic mac update procedure and
> the guest admin changed the mac several times.

I think modern nics make address updates atomic.
Problem is, we are emulating an ancient one,
so to make host configuration of a modern one
reasonable we need to resort to tricks.

> The problem with assuming that a given byte is always written last is
> that unless the hardware spec identifies an order, we're just basing our
> code on examples where we know what the guest driver does, either by
> code inspection or access tracing.  If there are examples of guests that
> write the last byte first, then the host will never know the correct mac
> address.  Maybe there are no guests that use the wrong order, but that's
> a pretty exhaustive search.

I agree what we have is a hack. Maybe we need some better hacks.
For example, maybe we should update mac at link state change
(I think  link is always down when mac is updated?).
Needs some thought.

> The patch doesn't change anything about how the NIC operates, only when
> mac changes are updated.  During an update the mac is in a transitory
> state and we can't go back in time to make it atomic on this hardware
> design to avoid a window where the wrong mac may be seen.  I think the
> patch is the right thing to do.  Thanks,
> 
> Alex


Yes but this info is propaged to host NIC by libvirt so it better
make sense.

> > > > I would say let's leave e1000/rtl8139 well alone unless
> > > > we see guests that actually write mac without touching
> > > > the last byte.
> > > 
> > > At least, linux rtl8139cp/e1000 writes macaddr from bit 0 to bit 5.
> > > It works to just watch the last bit.
> > >  
> > > Thanks, Amos
> > > 
> > > > Then think of ways to detect when mac change is done
> > > > for these.
> > > >
> > > > > ---
> > > > >  hw/net/e1000.c   | 2 +-
> > > > >  hw/net/rtl8139.c | 5 +
> > > > >  2 files changed, 2 insertions(+), 5 deletions(-)
> > > > > 
> > > > > diff --git a/hw/net/e1000.c b/hw/net/e1000.c
> > > > > index ec8ecd7..2d60639 100644
> > > > > --- a/hw/net/e1000.c
> > > > > +++ b/hw/net/e1000.c
> > > > > @@ -1110,7 +1110,7 @@ mac_writereg(E1000State *s, int index, uint32_t 
> > > > > val)
> > > > >  
> > > > >  s->mac_reg[index] = val;
> > > > >  
> > > > > -if (index == RA + 1) {
> > > > > +if (index == RA || index == RA + 1) {
> > > > >  macaddr[0] = cpu_to_le32(s->mac_reg[RA]);
> > > > >  macaddr[1] = cpu_to_le32(s->mac_reg[RA + 1]);
> > > > >  qemu_format_nic_info_str(qemu_get_queue(s->nic), (uint8_t 
> > > > > *)macaddr);
> > > > > diff --git a/hw/net/rtl8139.c b/hw/net/rtl8139.c
> > > > > index 5329f44..7f2b4db 100644
> > > > > --- a/hw/net/rtl8139.c
> > > > > +++ b/hw/net/rtl8139.c
> > > > > @@ -2741,10 +2741,7 @@ static void rtl8139_io_writeb(void *opaque, 
> > > > > uint8_t addr, uint32_t val)
> > > > >  
> > > > >  switch (addr)
> > > > >  {
> > > > > -case MAC0 ... MAC0+4:
> > > > > -s->phys[addr - MAC0] = val;
> > > > > -break;
> > > > > -case MAC0+5:
> > > > > +case MAC0 ... MAC0+5:
> > > > >  s->phys[addr - MAC0] = val;
> > 

Re: [Qemu-devel] Questions about Spice pv domUs

2013-11-07 Thread Fabio Fantoni

Il 06/11/2013 18:16, Stefano Stabellini ha scritto:

On Tue, 5 Nov 2013, Fabio Fantoni wrote:

Il 30/09/2013 16:56, Fabio Fantoni ha scritto:

I'm trying to implement basic spice support on xen pv domUs.

Test seems ok on Ubuntu 12.04 pv domU except mouse which is not visible.
I also tried agent-mouse=off on qemu spice options but is not working or
maybe spicy (from spice-gtk 0.20) has problem in this case (option to grab
mouse is already enabled).
I can't add vdagent for now on pv because it hasn't  pci support.
Are there xen parts which may give problem with mouse or couldn't be a xen
related problem?

Given that PCI and USB buses are both missing in PV guests, I guess that
the issue might be that spice assumes that the mouse is somehow emulated
by a USB device? I think it could be difficult to disentangle spice
support from usb/pci.
You could try to run only the mouse part of the xenfb protocol to get
mouse support.

BTW where are you running the spice backend? Is it a standalone daemon?



For now I did only fast test forcing qemu parameters for pv dom's on 
libxl code with the same spice paramters used for hvm domUs.
On first test was not working, then I added xenfb vga, and the screen 
works but mouse not.



Qemu parameters on my test was:

libxl: debug: libxl_dm.c:1282:libxl__spawn_local_dm: Spawning device-model
/usr/lib/xen/bin/qemu-system-i386 with arguments:
libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm:
/usr/lib/xen/bin/qemu-system-i386
libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm: -xen-domid
libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm:   19
libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm:   -chardev
libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm:
socket,id=libxl-cmd,path=/var/run/xen/qmp-libxl-19,server,nowait
libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm:   -mon
libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm:
chardev=libxl-cmd,mode=control
libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm: -nodefaults
libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm: -xen-attach
libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm:   -name
libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm:   PRECISE
libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm:   -k
libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm:   it
libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm:   -spice
libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm:
port=6002,tls-port=0,addr=0.0.0.0,disable-ticketing,agent-mouse=off
libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm:   -vga
libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm:   xenfb
libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm:   -M
libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm:   xenpv
libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm:   -m
libxl: debug: libxl_dm.c:1284:libxl__spawn_local_dm:   1025

I have also another question for qemu developers: I tried to change qemu
-vga parameter to -device but isn't working and I not found nothing on docs
or man. Is xenfb available with new qemu parameter -device?

As I replied in the other email, xenfb is configured and initialized via
xenstore. Why do you want a command line parameter for it?


I tried to do fast greps without find active code about xenfb.
Can you tell me what I must search for find the new part about xenfb please?

Thanks for any reply.



Re: [Qemu-devel] [PATCH for-1.7 v2 0/8] fix address space size issue

2013-11-07 Thread Paolo Bonzini
Il 07/11/2013 16:05, Michael S. Tsirkin ha scritto:
>> > So, ack for patch 5-7-8, which should also be enough to fix the problem
>> > that Luiz reported.
> Not at all. As long as exec.c ignores high bits, any access
> there will end up in the wrong region.

... unless it happens to be the correct region because the whole
2^64-bit range is covered by a single region.  This is what happens for
IOMMUs.

Paolo

> We might not get a crash but we'll get guest memory corruption.




Re: [Qemu-devel] [PATCH for-1.7 v2 0/8] fix address space size issue

2013-11-07 Thread Michael S. Tsirkin
On Thu, Nov 07, 2013 at 03:08:35PM +0100, Paolo Bonzini wrote:
> Il 07/11/2013 11:41, Marcel Apfelbaum ha scritto:
> > A bug reported by Luiz Capitulino let us to find
> > several bugs in memory address space setup.
> > 
> > One issue is that gdb stub can give us arbitrary addresses
> > and we'll try to access them.
> > Since our lookup ignored high bits in the address,
> > we hit a wrong section and got a crash.
> > In fact, PCI devices can access arbitrary addresses too,
> > so we should just make lookup robust against this case.
> > 
> > Another issue has to do with size of regions.
> > memory API uses UINT64_MAX so say "all 64 bit" but
> > some devices mistakenly used INT64_MAX.
> > 
> > It should not affect most systems in practice as
> > everything should be limited by address space size,
> > but it's an API misuse that we should not keep around,
> > and it will become a problem if a system with 64 bit
> > target address hits this path.
> > 
> > Patch 1 introduces TARGET_PHYS_ADDR_SPACE_MAX that is
> > the max size for memory regions rendered by exec. 
> > Patches 2-3 limits the size of memory regions used by exec.c.
> > Patch 4 fixes an actual bug.
> > The rest of patches make code cleaner and more robust.
> 
> I think this is wrong.
> 
> There is no reason why boards should be using TARGET_PHYS_ADDR_SPACE_MAX
> if the PCI address space is 64-bit.  Many files in hw/ do not even have
> access to TARGET_PHYS_ADDR_SPACE_MAX, which is only available to files
> that are compiled per-target.
> 
> The fact that you have to reduce the IOMMU address spaces from 64 bits
> (UINT64_MAX, not the buggy INT64_MAX) to something else is a red flag.
> If the IOMMU supports 64-bit addressing, the IOMMU regions should have
> 2^64 bits as their length.  Yes, it works by chance now, but it _does_
> work for 2^64-bit size regions: you can do DMA to a high address (say
> 0xf << 60), the IOMMU tables will convert to a low address that fits
> within TARGET_PHYS_ADDR_SPACE_BITS, and the everything will be fine.
> Patches 4 and 6 change that.
> 
> So, ack for patch 5-7-8, which should also be enough to fix the problem
> that Luiz reported.

Not at all. As long as exec.c ignores high bits, any access
there will end up in the wrong region.
We might not get a crash but we'll get guest memory corruption.
Not good.

>  For everything else the solution is to make memory
> dispatch use the whole 64 bits, as in
> http://permalink.gmane.org/gmane.comp.emulators.qemu/240229.  If you
> want to delay that to 1.8 that's fine.
> 
> Paolo
> 
> > 
> > Marcel Apfelbaum (3):
> >   exec: declare TARGET_PHYS_ADDR_SPACE_MAX to limit memory regions
> > rendered by exec
> >   hw/alpha: limit iommu-typhoon memory size
> >   hw/ppc: limit iommu-spapr memory size
> > 
> > Michael S. Tsirkin (4):
> >   exec: don't ignore high address bits on lookup
> >   pci: fix address space size for bridge
> >   exec: don't ignore high address bits on set
> >   spapr_pci: s/INT64_MAX/UINT64_MAX/
> > 
> > Paolo Bonzini (1):
> >   pc: s/INT64_MAX/UINT64_MAX/
> > 
> >  exec.c| 18 ++
> >  hw/alpha/typhoon.c|  2 +-
> >  hw/i386/pc_piix.c |  2 +-
> >  hw/i386/pc_q35.c  |  2 +-
> >  hw/pci/pci_bridge.c   |  2 +-
> >  hw/ppc/spapr_iommu.c  |  2 +-
> >  hw/ppc/spapr_pci.c|  2 +-
> >  include/exec/address-spaces.h |  4 
> >  8 files changed, 28 insertions(+), 6 deletions(-)
> > 



[Qemu-devel] [PATCH 2/2] block: Allow the user to define "node-name" option.

2013-11-07 Thread Benoît Canet
As node-name is a separate name space as device-name we can enable it's
definition right now: nobody will use it so no harm involved.

Signed-off-by: Benoit Canet 
---
 block.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/block.c b/block.c
index 230e71a..132981f 100644
--- a/block.c
+++ b/block.c
@@ -885,7 +885,8 @@ int bdrv_file_open(BlockDriverState **pbs, const char 
*filename,
 options = qdict_new();
 }
 
-bs = bdrv_new("", NULL);
+bs = bdrv_new("", qdict_get_try_str(options, "node-name"));
+qdict_del(options, "node-name");
 bs->options = options;
 options = qdict_clone_shallow(options);
 
@@ -1007,7 +1008,8 @@ int bdrv_open_backing_file(BlockDriverState *bs, QDict 
*options, Error **errp)
sizeof(backing_filename));
 }
 
-bs->backing_hd = bdrv_new("", NULL);
+bs->backing_hd = bdrv_new("", qdict_get_try_str(options, "node-name"));
+qdict_del(options, "node-name");
 
 if (bs->backing_format[0] != '\0') {
 back_drv = bdrv_find_format(bs->backing_format);
-- 
1.8.3.2




[Qemu-devel] [PATCH 0/2] Giving names to graph's BlockDriverState

2013-11-07 Thread Benoît Canet
The following series introduce a new file.node-name property in order to be
able to give a name to each BlockDriverState of the graph.

It also define "undefined" as a special value for node-name; a value that will 
be
used to indicate to the management that it can not manipulate a node because it
was not named.

After this patchset is merged I would like to take care of presenting the graph
to the management. (HMP &&/|| QMP)

Eric: Do you have some ideas on this topic ?

Best regards

Benoît


Benoît Canet (2):
  block: Add bs->node_name to hold the name of a bs node of the bs
graph.
  block: Allow the user to define "node-name" option.

 block.c   | 72 +++
 block/blkverify.c |  2 +-
 block/iscsi.c |  2 +-
 block/vmdk.c  |  2 +-
 block/vvfat.c |  4 +--
 blockdev.c|  8 +++---
 hw/block/xen_disk.c   |  2 +-
 include/block/block.h |  3 +-
 include/block/block_int.h |  9 +-
 qemu-img.c|  6 ++--
 qemu-io.c |  2 +-
 qemu-nbd.c|  2 +-
 12 files changed, 79 insertions(+), 35 deletions(-)

-- 
1.8.3.2




Re: [Qemu-devel] [PATCH v5 2/2] sheepdog: support user-defined redundancy option

2013-11-07 Thread Liu Yuan
On Wed, Nov 06, 2013 at 10:34:24AM +0100, Stefan Hajnoczi wrote:
> On Tue, Nov 05, 2013 at 08:46:07AM -0700, Eric Blake wrote:
> > On 11/05/2013 07:37 AM, Stefan Hajnoczi wrote:
> > 
> > >> +
> > >> +copy = strtol(n1, NULL, 10);
> > >> +if (copy > SD_MAX_COPIES) {
> > >> +return -EINVAL;
> > >> +}
> > 
> > > 
> > > The string manipulation can be simplified using sscanf(3) and
> > > is_numeric() can be dropped:
> > > 
> > > static int parse_redundancy(BDRVSheepdogState *s, const char *opt)
> > > {
> > > struct SheepdogInode *inode = &s->inode;
> > > uint8_t copy, parity;
> > > int n;
> > > 
> > > n = sscanf(opt, "%hhu:%hhu", ©, &parity);
> > 
> > Personally, I detest the use of sscanf() to parse integers out of
> > strings, because POSIX says that behavior is undefined if overflow
> > occurs.  For internal strings, you can get away with it.  But for
> > untrusted input that did not originate in your process, a user can mess
> > you up by passing a string that parses larger than the integer you are
> > trying to store into, where the behavior is unspecified whether it wraps
> > around module 256, parses additional digits, or any other odd behavior.
> >  By the time you've added code to sanitize untrusted input, it's just as
> > fast to use strtol() anyways.
> 
> Hmm...I didn't know that overflow was undefined behavior in POSIX :(.
> 
> In that case forget sscanf(3) can look at the strtol(3) result for
> errors.  There's still no need for a custom is_numeric() function.

Thanks for your comments, I'll remove is_numeric() for v6

Thanks
Yuan



[Qemu-devel] [PATCH v6 2/2] sheepdog: support user-defined redundancy option

2013-11-07 Thread Liu Yuan
Sheepdog support two kinds of redundancy, full replication and erasure coding.

# create a fully replicated vdi with x copies
 -o redundancy=x (1 <= x <= SD_MAX_COPIES)

# create a erasure coded vdi with x data strips and y parity strips
 -o redundancy=x:y (x must be one of {2,4,8,16} and 1 <= y < SD_EC_MAX_STRIP)

E.g, to convert a vdi into sheepdog vdi 'test' with 8:3 erasure coding scheme

$ qemu-img convert -o redundancy=8:3 linux-0.2.img sheepdog:test

Cc: Kevin Wolf 
Cc: Stefan Hajnoczi 
Signed-off-by: Liu Yuan 
---
 block/sheepdog.c  |   74 -
 include/block/block_int.h |1 +
 2 files changed, 74 insertions(+), 1 deletion(-)

diff --git a/block/sheepdog.c b/block/sheepdog.c
index 66b3ea8..144da4f 100644
--- a/block/sheepdog.c
+++ b/block/sheepdog.c
@@ -91,6 +91,14 @@
 #define SD_NR_VDIS   (1U << 24)
 #define SD_DATA_OBJ_SIZE (UINT64_C(1) << 22)
 #define SD_MAX_VDI_SIZE (SD_DATA_OBJ_SIZE * MAX_DATA_OBJS)
+/*
+ * For erasure coding, we use at most SD_EC_MAX_STRIP for data strips and
+ * (SD_EC_MAX_STRIP - 1) for parity strips
+ *
+ * SD_MAX_COPIES is sum of number of data strips and parity strips.
+ */
+#define SD_EC_MAX_STRIP 16
+#define SD_MAX_COPIES (SD_EC_MAX_STRIP * 2 - 1)
 
 #define SD_INODE_SIZE (sizeof(SheepdogInode))
 #define CURRENT_VDI_ID 0
@@ -1495,6 +1503,7 @@ static int do_sd_create(BDRVSheepdogState *s, uint32_t 
*vdi_id, int snapshot)
 hdr.data_length = wlen;
 hdr.vdi_size = s->inode.vdi_size;
 hdr.copy_policy = s->inode.copy_policy;
+hdr.copies = s->inode.nr_copies;
 
 ret = do_req(fd, (SheepdogReq *)&hdr, buf, &wlen, &rlen);
 
@@ -1562,6 +1571,60 @@ out:
 return ret;
 }
 
+/*
+ * Sheepdog support two kinds of redundancy, full replication and erasure
+ * coding.
+ *
+ * # create a fully replicated vdi with x copies
+ * -o redundancy=x (1 <= x <= SD_MAX_COPIES)
+ *
+ * # create a erasure coded vdi with x data strips and y parity strips
+ * -o redundancy=x:y (x must be one of {2,4,8,16} and 1 <= y < SD_EC_MAX_STRIP)
+ */
+static int parse_redundancy(BDRVSheepdogState *s, const char *opt)
+{
+struct SheepdogInode *inode = &s->inode;
+const char *n1, *n2;
+long copy, parity;
+char p[10];
+
+pstrcpy(p, sizeof(p), opt);
+n1 = strtok(p, ":");
+n2 = strtok(NULL, ":");
+
+if (!n1) {
+return -EINVAL;
+}
+
+copy = strtol(n1, NULL, 10);
+if (copy > SD_MAX_COPIES || copy < 1) {
+return -EINVAL;
+}
+if (!n2) {
+inode->copy_policy = 0;
+inode->nr_copies = copy;
+return 0;
+}
+
+if (copy != 2 && copy != 4 && copy != 8 && copy != 16) {
+return -EINVAL;
+}
+
+parity = strtol(n2, NULL, 10);
+if (parity >= SD_EC_MAX_STRIP || parity < 1) {
+return -EINVAL;
+}
+
+/*
+ * 4 bits for parity and 4 bits for data.
+ * We have to compress upper data bits because it can't represent 16
+ */
+inode->copy_policy = ((copy / 2) << 4) + parity;
+inode->nr_copies = copy + parity;
+
+return 0;
+}
+
 static int sd_create(const char *filename, QEMUOptionParameter *options,
  Error **errp)
 {
@@ -1602,6 +1665,11 @@ static int sd_create(const char *filename, 
QEMUOptionParameter *options,
 ret = -EINVAL;
 goto out;
 }
+} else if (!strcmp(options->name, BLOCK_OPT_REDUNDANCY)) {
+ret = parse_redundancy(s, options->value.s);
+if (ret < 0) {
+goto out;
+}
 }
 options++;
 }
@@ -1644,7 +1712,6 @@ static int sd_create(const char *filename, 
QEMUOptionParameter *options,
 bdrv_unref(bs);
 }
 
-/* TODO: allow users to specify copy number */
 ret = do_sd_create(s, &vid, 0);
 if (!prealloc || ret) {
 goto out;
@@ -2416,6 +2483,11 @@ static QEMUOptionParameter sd_create_options[] = {
 .type = OPT_STRING,
 .help = "Preallocation mode (allowed values: off, full)"
 },
+{
+.name = BLOCK_OPT_REDUNDANCY,
+.type = OPT_STRING,
+.help = "Redundancy of the image"
+},
 { NULL }
 };
 
diff --git a/include/block/block_int.h b/include/block/block_int.h
index a48731d..b90862f 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -53,6 +53,7 @@
 #define BLOCK_OPT_COMPAT_LEVEL  "compat"
 #define BLOCK_OPT_LAZY_REFCOUNTS"lazy_refcounts"
 #define BLOCK_OPT_ADAPTER_TYPE  "adapter_type"
+#define BLOCK_OPT_REDUNDANCY"redundancy"
 
 typedef struct BdrvTrackedRequest {
 BlockDriverState *bs;
-- 
1.7.9.5




[Qemu-devel] [PATCH v6 RESENT 0/2] sheepdog: add user-defined redundancy option

2013-11-07 Thread Liu Yuan
v6:
 - update comment typo
 - remove is_number()

v5:
 - use pstrcpy instead of strncpy
 - fix a segfalt for 'null' string option string

v4:
 - fix do_sd_create that forgot to pass nr_copies
 - fix parse_redundancy dealing with replicated vdi

v3:
 - rework is_numeric

v2:
 - fix a typo in comment and commit log

 This patch set add one sheepdog specific option for qemu-img to control
 redundancy.

 This patch set is on top of Kevin's block tree.

Liu Yuan (2):
  sheepdog: refactor do_sd_create()
  sheepdog: support user-defined redundancy option

 block/sheepdog.c  |  127 +
 include/block/block_int.h |1 +
 2 files changed, 105 insertions(+), 23 deletions(-)

-- 
1.7.9.5




[Qemu-devel] [PATCH v6 1/2] sheepdog: refactor do_sd_create()

2013-11-07 Thread Liu Yuan
We can actually use BDRVSheepdogState *s to pass most of the parameters.

Cc: Kevin Wolf 
Cc: Stefan Hajnoczi 
Signed-off-by: Liu Yuan 
---
 block/sheepdog.c |   37 +++--
 1 file changed, 15 insertions(+), 22 deletions(-)

diff --git a/block/sheepdog.c b/block/sheepdog.c
index ef387de..66b3ea8 100644
--- a/block/sheepdog.c
+++ b/block/sheepdog.c
@@ -1464,9 +1464,7 @@ out:
 return ret;
 }
 
-static int do_sd_create(BDRVSheepdogState *s, char *filename, int64_t vdi_size,
-uint32_t base_vid, uint32_t *vdi_id, int snapshot,
-uint8_t copy_policy)
+static int do_sd_create(BDRVSheepdogState *s, uint32_t *vdi_id, int snapshot)
 {
 SheepdogVdiReq hdr;
 SheepdogVdiRsp *rsp = (SheepdogVdiRsp *)&hdr;
@@ -1483,11 +1481,11 @@ static int do_sd_create(BDRVSheepdogState *s, char 
*filename, int64_t vdi_size,
  * does not fit in buf?  For now, just truncate and avoid buffer overrun.
  */
 memset(buf, 0, sizeof(buf));
-pstrcpy(buf, sizeof(buf), filename);
+pstrcpy(buf, sizeof(buf), s->name);
 
 memset(&hdr, 0, sizeof(hdr));
 hdr.opcode = SD_OP_NEW_VDI;
-hdr.vdi_id = base_vid;
+hdr.vdi_id = s->inode.vdi_id;
 
 wlen = SD_MAX_VDI_LEN;
 
@@ -1495,8 +1493,8 @@ static int do_sd_create(BDRVSheepdogState *s, char 
*filename, int64_t vdi_size,
 hdr.snapid = snapshot;
 
 hdr.data_length = wlen;
-hdr.vdi_size = vdi_size;
-hdr.copy_policy = copy_policy;
+hdr.vdi_size = s->inode.vdi_size;
+hdr.copy_policy = s->inode.copy_policy;
 
 ret = do_req(fd, (SheepdogReq *)&hdr, buf, &wlen, &rlen);
 
@@ -1507,7 +1505,7 @@ static int do_sd_create(BDRVSheepdogState *s, char 
*filename, int64_t vdi_size,
 }
 
 if (rsp->result != SD_RES_SUCCESS) {
-error_report("%s, %s", sd_strerror(rsp->result), filename);
+error_report("%s, %s", sd_strerror(rsp->result), s->inode.name);
 return -EIO;
 }
 
@@ -1568,23 +1566,21 @@ static int sd_create(const char *filename, 
QEMUOptionParameter *options,
  Error **errp)
 {
 int ret = 0;
-uint32_t vid = 0, base_vid = 0;
-int64_t vdi_size = 0;
+uint32_t vid = 0;
 char *backing_file = NULL;
 BDRVSheepdogState *s;
-char vdi[SD_MAX_VDI_LEN], tag[SD_MAX_VDI_TAG_LEN];
+char tag[SD_MAX_VDI_TAG_LEN];
 uint32_t snapid;
 bool prealloc = false;
 Error *local_err = NULL;
 
 s = g_malloc0(sizeof(BDRVSheepdogState));
 
-memset(vdi, 0, sizeof(vdi));
 memset(tag, 0, sizeof(tag));
 if (strstr(filename, "://")) {
-ret = sd_parse_uri(s, filename, vdi, &snapid, tag);
+ret = sd_parse_uri(s, filename, s->name, &snapid, tag);
 } else {
-ret = parse_vdiname(s, filename, vdi, &snapid, tag);
+ret = parse_vdiname(s, filename, s->name, &snapid, tag);
 }
 if (ret < 0) {
 goto out;
@@ -1592,7 +1588,7 @@ static int sd_create(const char *filename, 
QEMUOptionParameter *options,
 
 while (options && options->name) {
 if (!strcmp(options->name, BLOCK_OPT_SIZE)) {
-vdi_size = options->value.n;
+s->inode.vdi_size = options->value.n;
 } else if (!strcmp(options->name, BLOCK_OPT_BACKING_FILE)) {
 backing_file = options->value.s;
 } else if (!strcmp(options->name, BLOCK_OPT_PREALLOC)) {
@@ -1610,7 +1606,7 @@ static int sd_create(const char *filename, 
QEMUOptionParameter *options,
 options++;
 }
 
-if (vdi_size > SD_MAX_VDI_SIZE) {
+if (s->inode.vdi_size > SD_MAX_VDI_SIZE) {
 error_report("too big image size");
 ret = -EINVAL;
 goto out;
@@ -1645,12 +1641,11 @@ static int sd_create(const char *filename, 
QEMUOptionParameter *options,
 goto out;
 }
 
-base_vid = s->inode.vdi_id;
 bdrv_unref(bs);
 }
 
 /* TODO: allow users to specify copy number */
-ret = do_sd_create(s, vdi, vdi_size, base_vid, &vid, 0, 0);
+ret = do_sd_create(s, &vid, 0);
 if (!prealloc || ret) {
 goto out;
 }
@@ -1833,8 +1828,7 @@ static int sd_create_branch(BDRVSheepdogState *s)
  * false bail out.
  */
 deleted = sd_delete(s);
-ret = do_sd_create(s, s->name, s->inode.vdi_size, s->inode.vdi_id, &vid,
-   !deleted, s->inode.copy_policy);
+ret = do_sd_create(s, &vid, !deleted);
 if (ret) {
 goto out;
 }
@@ -2097,8 +2091,7 @@ static int sd_snapshot_create(BlockDriverState *bs, 
QEMUSnapshotInfo *sn_info)
 goto cleanup;
 }
 
-ret = do_sd_create(s, s->name, s->inode.vdi_size, s->inode.vdi_id, 
&new_vid,
-   1, s->inode.copy_policy);
+ret = do_sd_create(s, &new_vid, 1);
 if (ret < 0) {
 error_report("failed to create inode for snapshot. %s",
  strerror(errno));
-- 
1.7.9.5




Re: [Qemu-devel] [PATCH] e1000/rtl8139: update HMP NIC when every bit is written

2013-11-07 Thread Alex Williamson
On Thu, 2013-11-07 at 12:26 +0200, Michael S. Tsirkin wrote:
> On Thu, Nov 07, 2013 at 03:32:29PM +0800, Amos Kong wrote:
> > On Thu, Nov 07, 2013 at 08:59:22AM +0200, Michael S. Tsirkin wrote:
> > > On Tue, Nov 05, 2013 at 07:17:18PM +0800, Amos Kong wrote:
> > > > We currently just update the HMP NIC info when the last bit of macaddr
> > > > is written. This assumes that guest driver will write all the macaddr
> > > > from bit 0 to bit 5 when it changes the macaddr, this is the current
> > > > behavior of linux driver (e1000/rtl8139cp), but we can't do this
> > > > assumption.
> > > > 
> > > > The macaddr that is used for rx-filter will be updated when every bit
> > > > is changed. This patch updates the e1000/rtl8139 nic to update HMP NIC
> > > > info when every bit is changed. It will be same as virtio-net.
> > > > 
> > > > Signed-off-by: Amos Kong 
> > > 
> > > I'm not sure I buy this.
> > > 
> > > If we actually implement e.g. mac change notifications,
> > > sending them on writes of random bytes will confuse
> > > the host.
> >
> > This patch just effects the monitor display of macaddr.
> > During each writing, the macaddr is used for rx-filter is really
> > changed.
> > 
> > In the real hardware, it supports to just write part of bits,
> > the rx-filtering is effected by every bit writing.
> 
> Yes but again, the window can just be too small to matter
> on real hardware.
> 
> Our emulation is not perfect, fixing this to be just like real
> hardware just might expose other bugs we can't fix
> that easily.

If we were to implement mac change notification, then every partial
update would send a notify and the host.  Is that a problem?  It seems
no different than if the device had an atomic mac update procedure and
the guest admin changed the mac several times.

The problem with assuming that a given byte is always written last is
that unless the hardware spec identifies an order, we're just basing our
code on examples where we know what the guest driver does, either by
code inspection or access tracing.  If there are examples of guests that
write the last byte first, then the host will never know the correct mac
address.  Maybe there are no guests that use the wrong order, but that's
a pretty exhaustive search.

The patch doesn't change anything about how the NIC operates, only when
mac changes are updated.  During an update the mac is in a transitory
state and we can't go back in time to make it atomic on this hardware
design to avoid a window where the wrong mac may be seen.  I think the
patch is the right thing to do.  Thanks,

Alex

> > > I would say let's leave e1000/rtl8139 well alone unless
> > > we see guests that actually write mac without touching
> > > the last byte.
> > 
> > At least, linux rtl8139cp/e1000 writes macaddr from bit 0 to bit 5.
> > It works to just watch the last bit.
> >  
> > Thanks, Amos
> > 
> > > Then think of ways to detect when mac change is done
> > > for these.
> > >
> > > > ---
> > > >  hw/net/e1000.c   | 2 +-
> > > >  hw/net/rtl8139.c | 5 +
> > > >  2 files changed, 2 insertions(+), 5 deletions(-)
> > > > 
> > > > diff --git a/hw/net/e1000.c b/hw/net/e1000.c
> > > > index ec8ecd7..2d60639 100644
> > > > --- a/hw/net/e1000.c
> > > > +++ b/hw/net/e1000.c
> > > > @@ -1110,7 +1110,7 @@ mac_writereg(E1000State *s, int index, uint32_t 
> > > > val)
> > > >  
> > > >  s->mac_reg[index] = val;
> > > >  
> > > > -if (index == RA + 1) {
> > > > +if (index == RA || index == RA + 1) {
> > > >  macaddr[0] = cpu_to_le32(s->mac_reg[RA]);
> > > >  macaddr[1] = cpu_to_le32(s->mac_reg[RA + 1]);
> > > >  qemu_format_nic_info_str(qemu_get_queue(s->nic), (uint8_t 
> > > > *)macaddr);
> > > > diff --git a/hw/net/rtl8139.c b/hw/net/rtl8139.c
> > > > index 5329f44..7f2b4db 100644
> > > > --- a/hw/net/rtl8139.c
> > > > +++ b/hw/net/rtl8139.c
> > > > @@ -2741,10 +2741,7 @@ static void rtl8139_io_writeb(void *opaque, 
> > > > uint8_t addr, uint32_t val)
> > > >  
> > > >  switch (addr)
> > > >  {
> > > > -case MAC0 ... MAC0+4:
> > > > -s->phys[addr - MAC0] = val;
> > > > -break;
> > > > -case MAC0+5:
> > > > +case MAC0 ... MAC0+5:
> > > >  s->phys[addr - MAC0] = val;
> > > >  qemu_format_nic_info_str(qemu_get_queue(s->nic), s->phys);
> > > >  break;
> > > > -- 
> > > > 1.8.3.1
> 






[Qemu-devel] [PATCH -V7 3/3] target-ppc: Fix page table lookup with kvm enabled

2013-11-07 Thread Aneesh Kumar K.V
From: "Aneesh Kumar K.V" 

With kvm enabled, we store the hash page table information in the hypervisor.
Use ioctl to read the htab contents. Without this we get the below error when
trying to read the guest address

 (gdb) x/10 do_fork
 0xc0098660 :   Cannot access memory at address 
0xc0098660
 (gdb)

Signed-off-by: Aneesh Kumar K.V 
---
Changes from V6:
* drop htab_fd argument and use global variable kvmppc_kern_htab instead

 hw/ppc/spapr.c  |  1 +
 hw/ppc/spapr_hcall.c| 50 +++
 target-ppc/kvm.c| 53 +
 target-ppc/kvm_ppc.h| 19 
 target-ppc/mmu-hash64.c | 78 -
 target-ppc/mmu-hash64.h | 23 ++-
 6 files changed, 183 insertions(+), 41 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index d4f3502..8bf886e 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -662,6 +662,7 @@ static void spapr_reset_htab(sPAPREnvironment *spapr)
 if (shift > 0) {
 /* Kernel handles htab, we don't need to allocate one */
 spapr->htab_shift = shift;
+kvmppc_kern_htab = true;
 } else {
 if (!spapr->htab) {
 /* Allocate an htab if we don't yet have one */
diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index f10ba8a..f9ea691 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -50,8 +50,9 @@ static target_ulong h_enter(PowerPCCPU *cpu, sPAPREnvironment 
*spapr,
 target_ulong ptel = args[3];
 target_ulong page_shift = 12;
 target_ulong raddr;
-target_ulong i;
+target_ulong index;
 hwaddr hpte;
+void *token;
 
 /* only handle 4k and 16M pages for now */
 if (pteh & HPTE64_V_LARGE) {
@@ -94,30 +95,37 @@ static target_ulong h_enter(PowerPCCPU *cpu, 
sPAPREnvironment *spapr,
 if ((pte_index * HASH_PTE_SIZE_64) & ~env->htab_mask) {
 return H_PARAMETER;
 }
+
+index = 0;
+hpte = pte_index * HASH_PTE_SIZE_64;
 if (likely((flags & H_EXACT) == 0)) {
 pte_index &= ~7ULL;
-hpte = pte_index * HASH_PTE_SIZE_64;
-for (i = 0; ; ++i) {
-if (i == 8) {
+token = ppc_hash64_start_access(cpu, pte_index);
+do {
+if (index == 8) {
+ppc_hash64_stop_access(token);
 return H_PTEG_FULL;
 }
-if ((ppc_hash64_load_hpte0(env, hpte) & HPTE64_V_VALID) == 0) {
+if ((ppc_hash64_load_hpte0(env, token, index) & HPTE64_V_VALID) == 
0) {
 break;
 }
-hpte += HASH_PTE_SIZE_64;
-}
+} while (index++);
+ppc_hash64_stop_access(token);
 } else {
-i = 0;
-hpte = pte_index * HASH_PTE_SIZE_64;
-if (ppc_hash64_load_hpte0(env, hpte) & HPTE64_V_VALID) {
+token = ppc_hash64_start_access(cpu, pte_index);
+if (ppc_hash64_load_hpte0(env, token, 0) & HPTE64_V_VALID) {
+ppc_hash64_stop_access(token);
 return H_PTEG_FULL;
 }
+ppc_hash64_stop_access(token);
 }
+hpte += index * HASH_PTE_SIZE_64;
+
 ppc_hash64_store_hpte1(env, hpte, ptel);
 /* eieio();  FIXME: need some sort of barrier for smp? */
 ppc_hash64_store_hpte0(env, hpte, pteh | HPTE64_V_HPTE_DIRTY);
 
-args[0] = pte_index + i;
+args[0] = pte_index + index;
 return H_SUCCESS;
 }
 
@@ -134,16 +142,17 @@ static RemoveResult remove_hpte(CPUPPCState *env, 
target_ulong ptex,
 target_ulong *vp, target_ulong *rp)
 {
 hwaddr hpte;
+void *token;
 target_ulong v, r, rb;
 
 if ((ptex * HASH_PTE_SIZE_64) & ~env->htab_mask) {
 return REMOVE_PARM;
 }
 
-hpte = ptex * HASH_PTE_SIZE_64;
-
-v = ppc_hash64_load_hpte0(env, hpte);
-r = ppc_hash64_load_hpte1(env, hpte);
+token = ppc_hash64_start_access(ppc_env_get_cpu(env), ptex);
+v = ppc_hash64_load_hpte0(env, token, 0);
+r = ppc_hash64_load_hpte1(env, token, 0);
+ppc_hash64_stop_access(token);
 
 if ((v & HPTE64_V_VALID) == 0 ||
 ((flags & H_AVPN) && (v & ~0x7fULL) != avpn) ||
@@ -152,6 +161,7 @@ static RemoveResult remove_hpte(CPUPPCState *env, 
target_ulong ptex,
 }
 *vp = v;
 *rp = r;
+hpte = ptex * HASH_PTE_SIZE_64;
 ppc_hash64_store_hpte0(env, hpte, HPTE64_V_HPTE_DIRTY);
 rb = compute_tlbie_rb(v, r, ptex);
 ppc_tlb_invalidate_one(env, rb);
@@ -260,16 +270,17 @@ static target_ulong h_protect(PowerPCCPU *cpu, 
sPAPREnvironment *spapr,
 target_ulong pte_index = args[1];
 target_ulong avpn = args[2];
 hwaddr hpte;
+void *token;
 target_ulong v, r, rb;
 
 if ((pte_index * HASH_PTE_SIZE_64) & ~env->htab_mask) {
 return H_PARAMETER;
 }
 
-hpte = pte_index * HASH_PTE_SIZE_64;
-
-v = ppc_hash64_load_hpte0(env, hpte);
-r = ppc_hash64_load_hpte1(env, hpte);
+token = ppc_hash64_start_access(cpu, pte_index);
+  

[Qemu-devel] [PATCH -V7 2/3] target-ppc: Fix htab_mask calculation

2013-11-07 Thread Aneesh Kumar K.V
From: "Aneesh Kumar K.V" 

Correctly update the htab_mask using the return value of
KVM_PPC_ALLOCATE_HTAB ioctl. Also we don't update sdr1
on GET_SREGS for HV. So don't update htab_mask if sdr1
is found to be zero. Fix the pte index calculation to be
same as that found in the kernel

Signed-off-by: Aneesh Kumar K.V 
---
 hw/ppc/spapr.c  | 3 ++-
 target-ppc/mmu_helper.c | 4 +++-
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 22f2a8a..d4f3502 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -724,7 +724,8 @@ static void spapr_cpu_reset(void *opaque)
 env->external_htab = (void *)1;
 }
 env->htab_base = -1;
-env->htab_mask = HTAB_SIZE(spapr) - 1;
+/* 128 (2**7) bytes in each HPTEG */
+env->htab_mask = (1ULL << ((spapr)->htab_shift - 7)) - 1;
 env->spr[SPR_SDR1] = (target_ulong)(uintptr_t)spapr->htab |
 (spapr->htab_shift - 18);
 }
diff --git a/target-ppc/mmu_helper.c b/target-ppc/mmu_helper.c
index 04a840b..c39cb7b 100644
--- a/target-ppc/mmu_helper.c
+++ b/target-ppc/mmu_helper.c
@@ -2025,7 +2025,9 @@ void ppc_store_sdr1(CPUPPCState *env, target_ulong value)
 " stored in SDR1\n", htabsize);
 htabsize = 28;
 }
-env->htab_mask = (1ULL << (htabsize + 18)) - 1;
+if (htabsize) {
+env->htab_mask = (1ULL << (htabsize + 18 - 7)) - 1;
+}
 env->htab_base = value & SDR_64_HTABORG;
 } else
 #endif /* defined(TARGET_PPC64) */
-- 
1.8.3.2




[Qemu-devel] [PATCH -V7 1/3] target-ppc: Update external_htab even when HTAB is managed by kernel

2013-11-07 Thread Aneesh Kumar K.V
From: "Aneesh Kumar K.V" 

We will use this in later patches to make sure we use the right load
functions when copying hpte entries.

Signed-off-by: Aneesh Kumar K.V 
---
 hw/ppc/spapr.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 004184d..22f2a8a 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -716,6 +716,13 @@ static void spapr_cpu_reset(void *opaque)
 env->spr[SPR_HIOR] = 0;
 
 env->external_htab = (uint8_t *)spapr->htab;
+if (kvm_enabled() && !env->external_htab) {
+/*
+ * HV KVM, set external_htab to 1 so our ppc_hash64_load_hpte*
+ * functions do the right thing.
+ */
+env->external_htab = (void *)1;
+}
 env->htab_base = -1;
 env->htab_mask = HTAB_SIZE(spapr) - 1;
 env->spr[SPR_SDR1] = (target_ulong)(uintptr_t)spapr->htab |
-- 
1.8.3.2




[Qemu-devel] [PULL 35/36] block: qemu-iotests for vhdx, add write test support

2013-11-07 Thread Stefan Hajnoczi
From: Jeff Cody 

This removes the IMGFMT_GENERIC blocker for read-only, so existing
iotests run read/write tests for vhdx images created by qemu-img (e.g.
tests 001, 002, 003).

In addition, this updates the sample image test for the Hyper-V
created image, to verify we can write it as well.

Signed-off-by: Jeff Cody 
Signed-off-by: Stefan Hajnoczi 
---
 tests/qemu-iotests/064 | 11 +++
 tests/qemu-iotests/064.out | 14 ++
 tests/qemu-iotests/common  |  1 -
 3 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/064 b/tests/qemu-iotests/064
index 6789aa6..1c74c31 100755
--- a/tests/qemu-iotests/064
+++ b/tests/qemu-iotests/064
@@ -56,6 +56,17 @@ echo
 echo "=== Verify pattern 0x00, 66M - 1024M ==="
 $QEMU_IO -r -c "read -pP 0x00 66M 958M" "$TEST_IMG" | _filter_qemu_io
 
+echo
+echo "=== Verify pattern write, 0xc3 99M-157M ==="
+$QEMU_IO -c "write -pP 0xc3 99M 58M" "$TEST_IMG" | _filter_qemu_io
+# first verify we didn't write where we should not have
+$QEMU_IO -c "read -pP 0xa5 0 33M" "$TEST_IMG" | _filter_qemu_io
+$QEMU_IO -c "read -pP 0x96 33M 33M" "$TEST_IMG" | _filter_qemu_io
+$QEMU_IO -c "read -pP 0x00 66M 33M" "$TEST_IMG" | _filter_qemu_io
+$QEMU_IO -c "read -pP 0x00 157MM 867MM" "$TEST_IMG" | _filter_qemu_io
+# now verify what we should have actually written
+$QEMU_IO -c "read -pP 0xc3 99M 58M" "$TEST_IMG" | _filter_qemu_io
+
 # success, all done
 echo "*** done"
 rm -f $seq.full
diff --git a/tests/qemu-iotests/064.out b/tests/qemu-iotests/064.out
index b9e8e4a..5346a4e 100644
--- a/tests/qemu-iotests/064.out
+++ b/tests/qemu-iotests/064.out
@@ -11,4 +11,18 @@ read 34603008/34603008 bytes at offset 34603008
 === Verify pattern 0x00, 66M - 1024M ===
 read 1004535808/1004535808 bytes at offset 69206016
 958 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+=== Verify pattern write, 0xc3 99M-157M ===
+wrote 60817408/60817408 bytes at offset 103809024
+58 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 34603008/34603008 bytes at offset 0
+33 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 34603008/34603008 bytes at offset 34603008
+33 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 34603008/34603008 bytes at offset 69206016
+33 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 909115392/909115392 bytes at offset 164626432
+867 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 60817408/60817408 bytes at offset 103809024
+58 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 *** done
diff --git a/tests/qemu-iotests/common b/tests/qemu-iotests/common
index 2932e14..8cde7f1 100644
--- a/tests/qemu-iotests/common
+++ b/tests/qemu-iotests/common
@@ -200,7 +200,6 @@ testlist options
 -vhdx)
 IMGFMT=vhdx
 xpand=false
-IMGFMT_GENERIC=false
 ;;
 
 -rbd)
-- 
1.8.3.1




[Qemu-devel] [PATCH -V2] kvm: Add a new machine property kvm_type

2013-11-07 Thread Aneesh Kumar K.V
From: "Aneesh Kumar K.V" 

Targets like ppc64 support different typed of KVM, one which use
hypervisor mode and the other which doesn't. Add a new machine
property kvm_type that helps in selecting the respective ones
We also add a new QEMUMachine callback get_vm_type that helps
in mapping the string representation of kvm type specified.

Signed-off-by: Aneesh Kumar K.V 
---

Changes from V1:
* change get_vm_type to kvm_type
* change helper text

 hw/ppc/e500plat.c  |  2 ++
 hw/ppc/kvmtype.h   | 18 ++
 hw/ppc/mac_newworld.c  |  2 ++
 hw/ppc/mac_oldworld.c  |  2 ++
 hw/ppc/ppc440_bamboo.c |  2 ++
 hw/ppc/spapr.c | 19 +++
 include/hw/boards.h|  3 +++
 include/hw/xen/xen.h   |  3 ++-
 include/sysemu/kvm.h   |  4 ++--
 include/sysemu/qtest.h |  5 +++--
 kvm-all.c  | 16 +---
 kvm-stub.c |  4 +++-
 qtest.c|  2 +-
 vl.c   | 17 +++--
 xen-all.c  |  2 +-
 xen-stub.c |  2 +-
 16 files changed, 85 insertions(+), 18 deletions(-)
 create mode 100644 hw/ppc/kvmtype.h

diff --git a/hw/ppc/e500plat.c b/hw/ppc/e500plat.c
index 2e964b2..4be4b24 100644
--- a/hw/ppc/e500plat.c
+++ b/hw/ppc/e500plat.c
@@ -17,6 +17,7 @@
 #include "hw/pci/pci.h"
 #include "hw/ppc/openpic.h"
 #include "kvm_ppc.h"
+#include "kvmtype.h"
 
 static void e500plat_fixup_devtree(PPCE500Params *params, void *fdt)
 {
@@ -51,6 +52,7 @@ static QEMUMachine e500plat_machine = {
 .desc = "generic paravirt e500 platform",
 .init = e500plat_init,
 .max_cpus = 32,
+.kvm_type = pr_kvm_type,
 };
 
 static void e500plat_machine_init(void)
diff --git a/hw/ppc/kvmtype.h b/hw/ppc/kvmtype.h
new file mode 100644
index 000..dbe4f50
--- /dev/null
+++ b/hw/ppc/kvmtype.h
@@ -0,0 +1,18 @@
+#ifndef PPC_KVMTYPE_H
+#define PPC_KVMTYPE_H
+
+static inline int pr_kvm_type(const char *vm_type)
+{
+if (!vm_type) {
+return 0;
+}
+
+if (!strcmp(vm_type, "PR")) {
+return 2;
+}
+
+hw_error("Unknown kvm_type specified '%s'", vm_type);
+exit(1);
+}
+
+#endif
diff --git a/hw/ppc/mac_newworld.c b/hw/ppc/mac_newworld.c
index 5e79575..0e09800 100644
--- a/hw/ppc/mac_newworld.c
+++ b/hw/ppc/mac_newworld.c
@@ -68,6 +68,7 @@
 #include "sysemu/blockdev.h"
 #include "exec/address-spaces.h"
 #include "hw/sysbus.h"
+#include "kvmtype.h"
 
 #define MAX_IDE_BUS 2
 #define CFG_ADDR 0xf510
@@ -478,6 +479,7 @@ static QEMUMachine core99_machine = {
 .init = ppc_core99_init,
 .max_cpus = MAX_CPUS,
 .default_boot_order = "cd",
+.kvm_type = pr_kvm_type,
 };
 
 static void core99_machine_init(void)
diff --git a/hw/ppc/mac_oldworld.c b/hw/ppc/mac_oldworld.c
index 2f27754..02b200e 100644
--- a/hw/ppc/mac_oldworld.c
+++ b/hw/ppc/mac_oldworld.c
@@ -42,6 +42,7 @@
 #include "kvm_ppc.h"
 #include "sysemu/blockdev.h"
 #include "exec/address-spaces.h"
+#include "kvmtype.h"
 
 #define MAX_IDE_BUS 2
 #define CFG_ADDR 0xf510
@@ -351,6 +352,7 @@ static QEMUMachine heathrow_machine = {
 .is_default = 1,
 #endif
 .default_boot_order = "cd", /* TOFIX "cad" when Mac floppy is implemented 
*/
+.kvm_type = pr_kvm_type,
 };
 
 static void heathrow_machine_init(void)
diff --git a/hw/ppc/ppc440_bamboo.c b/hw/ppc/ppc440_bamboo.c
index 655e499..903d305 100644
--- a/hw/ppc/ppc440_bamboo.c
+++ b/hw/ppc/ppc440_bamboo.c
@@ -28,6 +28,7 @@
 #include "ppc405.h"
 #include "sysemu/sysemu.h"
 #include "hw/sysbus.h"
+#include "kvmtype.h"
 
 #define BINARY_DEVICE_TREE_FILE "bamboo.dtb"
 
@@ -296,6 +297,7 @@ static QEMUMachine bamboo_machine = {
 .name = "bamboo",
 .desc = "bamboo",
 .init = bamboo_init,
+.kvm_type = pr_kvm_type,
 };
 
 static void bamboo_machine_init(void)
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 8bf886e..5733396 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1346,6 +1346,24 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
 assert(spapr->fdt_skel != NULL);
 }
 
+static int spapr_kvm_type(const char *vm_type)
+{
+if (!vm_type) {
+return 0;
+}
+
+if (!strcmp(vm_type, "HV")) {
+return 1;
+}
+
+if (!strcmp(vm_type, "PR")) {
+return 2;
+}
+
+hw_error("Unknown kvm_type specified '%s'", vm_type);
+exit(1);
+}
+
 static QEMUMachine spapr_machine = {
 .name = "pseries",
 .desc = "pSeries Logical Partition (PAPR compliant)",
@@ -1356,6 +1374,7 @@ static QEMUMachine spapr_machine = {
 .max_cpus = MAX_CPUS,
 .no_parallel = 1,
 .default_boot_order = NULL,
+.kvm_type = spapr_kvm_type,
 };
 
 static void spapr_machine_init(void)
diff --git a/include/hw/boards.h b/include/hw/boards.h
index 5a7ae9f..4d93bca 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -21,6 +21,8 @@ typedef void QEMUMachineResetFunc(void);
 
 typedef void QEMUMachineHotAddCPUFunc(const int64_t id, Error **errp);
 
+typedef int QEMUMachineGetKvmtypeFunc(const char *arg);
+
 typedef struct QEMUMa

[Qemu-devel] [PULL 14/36] blockdev-test: add test case for drive_add duplicate IDs

2013-11-07 Thread Stefan Hajnoczi
The following should work:

  (qemu) drive_add if=none,id=drive0
  (qemu) drive_del drive0
  (qemu) drive_add if=none,id=drive0

Previous versions of QEMU produced a duplicate ID error because
drive_add leaked the options.

Signed-off-by: Stefan Hajnoczi 
Reviewed-by: Eric Blake 
---
 tests/Makefile|  2 ++
 tests/blockdev-test.c | 59 +++
 2 files changed, 61 insertions(+)
 create mode 100644 tests/blockdev-test.c

diff --git a/tests/Makefile b/tests/Makefile
index f414f2c..973f497 100644
--- a/tests/Makefile
+++ b/tests/Makefile
@@ -68,6 +68,7 @@ check-qtest-i386-y += tests/rtc-test$(EXESUF)
 check-qtest-i386-y += tests/i440fx-test$(EXESUF)
 check-qtest-i386-y += tests/fw_cfg-test$(EXESUF)
 check-qtest-i386-y += tests/qom-test$(EXESUF)
+check-qtest-i386-y += tests/blockdev-test$(EXESUF)
 check-qtest-x86_64-y = $(check-qtest-i386-y)
 gcov-files-i386-y += i386-softmmu/hw/mc146818rtc.c
 gcov-files-x86_64-y = $(subst 
i386-softmmu/,x86_64-softmmu/,$(gcov-files-i386-y))
@@ -200,6 +201,7 @@ tests/tmp105-test$(EXESUF): tests/tmp105-test.o 
$(libqos-omap-obj-y)
 tests/i440fx-test$(EXESUF): tests/i440fx-test.o $(libqos-pc-obj-y)
 tests/fw_cfg-test$(EXESUF): tests/fw_cfg-test.o $(libqos-pc-obj-y)
 tests/qom-test$(EXESUF): tests/qom-test.o
+tests/blockdev-test$(EXESUF): tests/blockdev-test.o $(libqos-pc-obj-y)
 tests/qemu-iotests/socket_scm_helper$(EXESUF): 
tests/qemu-iotests/socket_scm_helper.o
 
 # QTest rules
diff --git a/tests/blockdev-test.c b/tests/blockdev-test.c
new file mode 100644
index 000..c940e00
--- /dev/null
+++ b/tests/blockdev-test.c
@@ -0,0 +1,59 @@
+/*
+ * blockdev.c test cases
+ *
+ * Copyright (C) 2013 Red Hat Inc.
+ *
+ * Authors:
+ *  Stefan Hajnoczi 
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2.1 or later.
+ * See the COPYING.LIB file in the top-level directory.
+ */
+
+#include 
+#include 
+#include "libqtest.h"
+
+static void test_drive_add_empty(void)
+{
+QDict *response;
+const char *response_return;
+
+/* Start with an empty drive */
+qtest_start("-drive if=none,id=drive0");
+
+/* Delete the drive */
+response = qmp("{\"execute\": \"human-monitor-command\","
+   " \"arguments\": {"
+   "   \"command-line\": \"drive_del drive0\""
+   "}}");
+g_assert(response);
+response_return = qdict_get_try_str(response, "return");
+g_assert(response_return);
+g_assert(strcmp(response_return, "") == 0);
+QDECREF(response);
+
+/* Ensure re-adding the drive works - there should be no duplicate ID error
+ * because the old drive must be gone.
+ */
+response = qmp("{\"execute\": \"human-monitor-command\","
+   " \"arguments\": {"
+   "   \"command-line\": \"drive_add 0 if=none,id=drive0\""
+   "}}");
+g_assert(response);
+response_return = qdict_get_try_str(response, "return");
+g_assert(response_return);
+g_assert(strcmp(response_return, "OK\r\n") == 0);
+QDECREF(response);
+
+qtest_end();
+}
+
+int main(int argc, char **argv)
+{
+g_test_init(&argc, &argv, NULL);
+
+qtest_add_func("/qmp/drive_add_empty", test_drive_add_empty);
+
+return g_test_run();
+}
-- 
1.8.3.1




[Qemu-devel] [PULL 34/36] block: vhdx - update _make_test_img() to filter out vhdx options

2013-11-07 Thread Stefan Hajnoczi
From: Jeff Cody 

The non-global option output is suppresed in _make_test_img() for
output verification in the 0?? tests.  This adds suppression for
the vhdx-unique options as well.  This allows check -vhdx to run
successfully.

Signed-off-by: Jeff Cody 
Signed-off-by: Stefan Hajnoczi 
---
 tests/qemu-iotests/common.rc | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/common.rc b/tests/qemu-iotests/common.rc
index d24de2c..7f62457 100644
--- a/tests/qemu-iotests/common.rc
+++ b/tests/qemu-iotests/common.rc
@@ -157,7 +157,10 @@ _make_test_img()
 -e "s# zeroed_grain=\\(on\\|off\\)##g" \
 -e "s# subformat='[^']*'##g" \
 -e "s# adapter_type='[^']*'##g" \
--e "s# lazy_refcounts=\\(on\\|off\\)##g"
+-e "s# lazy_refcounts=\\(on\\|off\\)##g" \
+-e "s# block_size=[0-9]\\+##g" \
+-e "s# block_state_zero=\\(on\\|off\\)##g" \
+-e "s# log_size=[0-9]\\+##g"
 
 # Start an NBD server on the image file, which is what we'll be talking to
 if [ $IMGPROTO = "nbd" ]; then
-- 
1.8.3.1




[Qemu-devel] [PULL 13/36] libqtest: add qmp(fmt, ...) -> QDict* function

2013-11-07 Thread Stefan Hajnoczi
Add a qtest qmp() function that returns the response object.  This
allows test cases to verify the result or to check for error responses.
It also allows waiting for QMP events.

Signed-off-by: Stefan Hajnoczi 
Reviewed-by: Eric Blake 
Reviewed-by: Andreas Färber 
---
 tests/libqtest.c | 66 
 tests/libqtest.h | 37 +++
 2 files changed, 89 insertions(+), 14 deletions(-)

diff --git a/tests/libqtest.c b/tests/libqtest.c
index dc4c983..83424c3 100644
--- a/tests/libqtest.c
+++ b/tests/libqtest.c
@@ -30,6 +30,8 @@
 
 #include "qemu/compiler.h"
 #include "qemu/osdep.h"
+#include "qapi/qmp/json-streamer.h"
+#include "qapi/qmp/json-parser.h"
 
 #define MAX_IRQ 256
 
@@ -291,16 +293,38 @@ redo:
 return words;
 }
 
-void qtest_qmpv_discard_response(QTestState *s, const char *fmt, va_list ap)
+typedef struct {
+JSONMessageParser parser;
+QDict *response;
+} QMPResponseParser;
+
+static void qmp_response(JSONMessageParser *parser, QList *tokens)
 {
-bool has_reply = false;
-int nesting = 0;
+QMPResponseParser *qmp = container_of(parser, QMPResponseParser, parser);
+QObject *obj;
+
+obj = json_parser_parse(tokens, NULL);
+if (!obj) {
+fprintf(stderr, "QMP JSON response parsing failed\n");
+exit(1);
+}
+
+g_assert(qobject_type(obj) == QTYPE_QDICT);
+g_assert(!qmp->response);
+qmp->response = (QDict *)obj;
+}
+
+QDict *qtest_qmpv(QTestState *s, const char *fmt, va_list ap)
+{
+QMPResponseParser qmp;
 
 /* Send QMP request */
 socket_sendf(s->qmp_fd, fmt, ap);
 
 /* Receive reply */
-while (!has_reply || nesting > 0) {
+qmp.response = NULL;
+json_message_parser_init(&qmp.parser, qmp_response);
+while (!qmp.response) {
 ssize_t len;
 char c;
 
@@ -314,25 +338,39 @@ void qtest_qmpv_discard_response(QTestState *s, const 
char *fmt, va_list ap)
 exit(1);
 }
 
-switch (c) {
-case '{':
-nesting++;
-has_reply = true;
-break;
-case '}':
-nesting--;
-break;
-}
+json_message_parser_feed(&qmp.parser, &c, 1);
 }
+json_message_parser_destroy(&qmp.parser);
+
+return qmp.response;
+}
+
+QDict *qtest_qmp(QTestState *s, const char *fmt, ...)
+{
+va_list ap;
+QDict *response;
+
+va_start(ap, fmt);
+response = qtest_qmpv(s, fmt, ap);
+va_end(ap);
+return response;
+}
+
+void qtest_qmpv_discard_response(QTestState *s, const char *fmt, va_list ap)
+{
+QDict *response = qtest_qmpv(s, fmt, ap);
+QDECREF(response);
 }
 
 void qtest_qmp_discard_response(QTestState *s, const char *fmt, ...)
 {
 va_list ap;
+QDict *response;
 
 va_start(ap, fmt);
-qtest_qmpv_discard_response(s, fmt, ap);
+response = qtest_qmpv(s, fmt, ap);
 va_end(ap);
+QDECREF(response);
 }
 
 const char *qtest_get_arch(void)
diff --git a/tests/libqtest.h b/tests/libqtest.h
index 4f1b060..9deebdc 100644
--- a/tests/libqtest.h
+++ b/tests/libqtest.h
@@ -22,6 +22,7 @@
 #include 
 #include 
 #include 
+#include "qapi/qmp/qdict.h"
 
 typedef struct QTestState QTestState;
 
@@ -53,6 +54,15 @@ void qtest_quit(QTestState *s);
 void qtest_qmp_discard_response(QTestState *s, const char *fmt, ...);
 
 /**
+ * qtest_qmp:
+ * @s: #QTestState instance to operate on.
+ * @fmt...: QMP message to send to qemu
+ *
+ * Sends a QMP message to QEMU and returns the response.
+ */
+QDict *qtest_qmp(QTestState *s, const char *fmt, ...);
+
+/**
  * qtest_qmpv_discard_response:
  * @s: #QTestState instance to operate on.
  * @fmt: QMP message to send to QEMU
@@ -63,6 +73,16 @@ void qtest_qmp_discard_response(QTestState *s, const char 
*fmt, ...);
 void qtest_qmpv_discard_response(QTestState *s, const char *fmt, va_list ap);
 
 /**
+ * qtest_qmpv:
+ * @s: #QTestState instance to operate on.
+ * @fmt: QMP message to send to QEMU
+ * @ap: QMP message arguments
+ *
+ * Sends a QMP message to QEMU and returns the response.
+ */
+QDict *qtest_qmpv(QTestState *s, const char *fmt, va_list ap);
+
+/**
  * qtest_get_irq:
  * @s: #QTestState instance to operate on.
  * @num: Interrupt to observe.
@@ -331,6 +351,23 @@ static inline void qtest_end(void)
 }
 
 /**
+ * qmp:
+ * @fmt...: QMP message to send to qemu
+ *
+ * Sends a QMP message to QEMU and returns the response.
+ */
+static inline QDict *qmp(const char *fmt, ...)
+{
+va_list ap;
+QDict *response;
+
+va_start(ap, fmt);
+response = qtest_qmpv(global_qtest, fmt, ap);
+va_end(ap);
+return response;
+}
+
+/**
  * qmp_discard_response:
  * @fmt...: QMP message to send to qemu
  *
-- 
1.8.3.1




[Qemu-devel] [PULL 20/36] block: vhdx code movement - VHDXMetadataEntries and BDRVVHDXState to header.

2013-11-07 Thread Stefan Hajnoczi
From: Jeff Cody 

In preparation for VHDX log support, move these structures to the
header.

Signed-off-by: Jeff Cody 
Signed-off-by: Stefan Hajnoczi 
---
 block/vhdx.c | 52 
 block/vhdx.h | 48 
 2 files changed, 48 insertions(+), 52 deletions(-)

diff --git a/block/vhdx.c b/block/vhdx.c
index 7b94c42..2c921cf 100644
--- a/block/vhdx.c
+++ b/block/vhdx.c
@@ -105,16 +105,6 @@ static const MSGUID parent_vhdx_guid = { .data1 = 
0xb04aefb7,
  META_PAGE_83_PRESENT | META_LOGICAL_SECTOR_SIZE_PRESENT | \
  META_PHYS_SECTOR_SIZE_PRESENT)
 
-typedef struct VHDXMetadataEntries {
-VHDXMetadataTableEntry file_parameters_entry;
-VHDXMetadataTableEntry virtual_disk_size_entry;
-VHDXMetadataTableEntry page83_data_entry;
-VHDXMetadataTableEntry logical_sector_size_entry;
-VHDXMetadataTableEntry phys_sector_size_entry;
-VHDXMetadataTableEntry parent_locator_entry;
-uint16_t present;
-} VHDXMetadataEntries;
-
 
 typedef struct VHDXSectorInfo {
 uint32_t bat_idx;   /* BAT entry index */
@@ -125,48 +115,6 @@ typedef struct VHDXSectorInfo {
 uint64_t block_offset;  /* block offset, in bytes */
 } VHDXSectorInfo;
 
-
-
-typedef struct BDRVVHDXState {
-CoMutex lock;
-
-int curr_header;
-VHDXHeader *headers[2];
-
-VHDXRegionTableHeader rt;
-VHDXRegionTableEntry bat_rt; /* region table for the BAT */
-VHDXRegionTableEntry metadata_rt;/* region table for the metadata */
-
-VHDXMetadataTableHeader metadata_hdr;
-VHDXMetadataEntries metadata_entries;
-
-VHDXFileParameters params;
-uint32_t block_size;
-uint32_t block_size_bits;
-uint32_t sectors_per_block;
-uint32_t sectors_per_block_bits;
-
-uint64_t virtual_disk_size;
-uint32_t logical_sector_size;
-uint32_t physical_sector_size;
-
-uint64_t chunk_ratio;
-uint32_t chunk_ratio_bits;
-uint32_t logical_sector_size_bits;
-
-uint32_t bat_entries;
-VHDXBatEntry *bat;
-uint64_t bat_offset;
-
-MSGUID session_guid;
-
-
-VHDXParentLocatorHeader parent_header;
-VHDXParentLocatorEntry *parent_entries;
-
-Error *migration_blocker;
-} BDRVVHDXState;
-
 /* Calculates new checksum.
  *
  * Zero is substituted during crc calculation for the original crc field
diff --git a/block/vhdx.h b/block/vhdx.h
index 403f766..2f7461d 100644
--- a/block/vhdx.h
+++ b/block/vhdx.h
@@ -308,6 +308,54 @@ typedef struct QEMU_PACKED VHDXParentLocatorEntry {
 
 /* - END VHDX SPECIFICATION STRUCTURES  */
 
+typedef struct VHDXMetadataEntries {
+VHDXMetadataTableEntry file_parameters_entry;
+VHDXMetadataTableEntry virtual_disk_size_entry;
+VHDXMetadataTableEntry page83_data_entry;
+VHDXMetadataTableEntry logical_sector_size_entry;
+VHDXMetadataTableEntry phys_sector_size_entry;
+VHDXMetadataTableEntry parent_locator_entry;
+uint16_t present;
+} VHDXMetadataEntries;
+
+typedef struct BDRVVHDXState {
+CoMutex lock;
+
+int curr_header;
+VHDXHeader *headers[2];
+
+VHDXRegionTableHeader rt;
+VHDXRegionTableEntry bat_rt; /* region table for the BAT */
+VHDXRegionTableEntry metadata_rt;/* region table for the metadata */
+
+VHDXMetadataTableHeader metadata_hdr;
+VHDXMetadataEntries metadata_entries;
+
+VHDXFileParameters params;
+uint32_t block_size;
+uint32_t block_size_bits;
+uint32_t sectors_per_block;
+uint32_t sectors_per_block_bits;
+
+uint64_t virtual_disk_size;
+uint32_t logical_sector_size;
+uint32_t physical_sector_size;
+
+uint64_t chunk_ratio;
+uint32_t chunk_ratio_bits;
+uint32_t logical_sector_size_bits;
+
+uint32_t bat_entries;
+VHDXBatEntry *bat;
+uint64_t bat_offset;
+
+MSGUID session_guid;
+
+VHDXParentLocatorHeader parent_header;
+VHDXParentLocatorEntry *parent_entries;
+
+Error *migration_blocker;
+} BDRVVHDXState;
 
 void vhdx_guid_generate(MSGUID *guid);
 
-- 
1.8.3.1




Re: [Qemu-devel] [PATCH] block: Print its file name if backing file opening failed

2013-11-07 Thread Eric Blake
On 11/07/2013 12:34 AM, Fam Zheng wrote:
> If backing file doesn't exist, the error message is confusing and
> misleading:
> 

> 
> This is not intuitive. It's better to have the missing file's name in
> the error message. With this patch:
> 
> $ qemu-io -c 'read 0 512' /tmp/a.qcow2
> qemu-io: can't open device /tmp/a.qcow2: Could not open backing
> file: Could not open '/stor/vm/arch.raw': No such file or directory
> no file open, try 'help open'
> 
> Which is a little bit better.

Indeed.  Are you trying to get this into 1.7?

> 
> Signed-off-by: Fam Zheng 
> ---
>  block.c| 3 ++-
>  block/raw-posix.c  | 1 -
>  block/raw-win32.c  | 1 -
>  tests/qemu-iotests/051.out | 2 +-
>  tests/qemu-iotests/069.out | 2 +-
>  5 files changed, 4 insertions(+), 5 deletions(-)

> 
> diff --git a/block.c b/block.c
> index f706634..a8dbcfc 100644
> --- a/block.c
> +++ b/block.c
> @@ -1009,7 +1009,8 @@ int bdrv_open_backing_file(BlockDriverState *bs, QDict 
> *options, Error **errp)
>  bdrv_unref(bs->backing_hd);
>  bs->backing_hd = NULL;
>  bs->open_flags |= BDRV_O_NO_BACKING;
> -error_propagate(errp, local_err);
> +error_setg(errp, "Could not open backing file: %s",
> +   error_get_pretty(local_err));
>  return ret;

Needs a v2, that doesn't leak local_err.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


  1   2   >