date:20140728

On Fri, 25 Jul 2014 19:56:40 +0200
Laszlo Ersek ler...@redhat.com wrote:

 On 07/25/14 17:48, Igor Mammedov wrote:
 
  Add API to mark memory region as extend-able on migration,
  to allow migration code to load smaller RAMBlock into
  a bigger one on destination QEMU instance.
 
  This will allow to fix broken migration from QEMU 1.7/2.0 to
  QEMU 2.1 due to  ACPI tables size changes across 1.7/2.0/2.1
  versions by marking ACPI tables ROM blob as extend-able.
  So that smaller tables from previos version could be always
 
 (previous)
 
  migrated to a bigger rom blob on new version.
 
  Credits-for-idea: Michael S. Tsirkin m...@redhat.com
  Signed-off-by: Igor Mammedov imamm...@redhat.com
  ---
   arch_init.c | 19 ++-
   exec.c  | 15 +--
   include/exec/cpu-all.h  | 15 ++-
   include/exec/memory.h   | 10 ++
   include/exec/ram_addr.h |  1 +
   memory.c|  5 +
   6 files changed, 53 insertions(+), 12 deletions(-)
 
 (Please pass the -O flag to git-format-patch, with an order file that
 moves header files (*.h) to the front. Header hunks should lead in a
 patch. Thanks.)
 
  diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
  index 6593be1..248c075 100644
  --- a/include/exec/ram_addr.h
  +++ b/include/exec/ram_addr.h
  @@ -33,6 +33,7 @@ void *qemu_get_ram_block_host_ptr(ram_addr_t addr);
   void *qemu_get_ram_ptr(ram_addr_t addr);
   void qemu_ram_free(ram_addr_t addr);
   void qemu_ram_free_from_ptr(ram_addr_t addr);
  +void qemu_ram_set_extendable_on_migration(ram_addr_t addr);
 
   static inline bool cpu_physical_memory_get_dirty(ram_addr_t start,
ram_addr_t length,
 
 (Ugh. The declarations (prototypes) of qemu_ram_*() functions are
 scattered between ram_addr.h and cpu-common.h (both under
 include/exec). I can't see why that is a good thing (the function
 definitions all seem to be in exec.c).)
Trying to avoid putting stuff to global cpu-common.h, I've put
definition in ram_addr.h. The rest should be moved from cpu-common.h
to ram_addr.h when 2.2 is open.

[...]

 
  diff --git a/arch_init.c b/arch_init.c
  index 8ddaf35..812f8b5 100644
  --- a/arch_init.c
  +++ b/arch_init.c
  @@ -1071,11 +1071,20 @@ static int ram_load(QEMUFile *f, void *opaque, int 
  version_id)
 
   QTAILQ_FOREACH(block, ram_list.blocks, next) {
   if (!strncmp(id, block-idstr, sizeof(id))) {
  -if (block-length != length) {
  -error_report(Length mismatch: %s:  
  RAM_ADDR_FMT
  -  in !=  RAM_ADDR_FMT, id, 
  length,
  - block-length);
  -ret =  -EINVAL;
  +if (block-flags  RAM_EXTENDABLE_ON_MIGRATE) {
  +if (block-length  length) {
  +error_report(Length too big: %s: 
  RAM_ADDR_FMT
 
 missing space between '' and RAM_ADDR_FMT (for readability)
 
 
  +RAM_ADDR_FMT, id, 
  length,
 
 you dropped the important word in here (cf. in != above). That word
 explains it to the user that the incoming size (stored in length) is
 too big.
 
  + block-length);
  +ret =  -EINVAL;
  +}
  +} else {
  +if (block-length != length) {
  +error_report(Length mismatch: %s: 
  RAM_ADDR_FMT
 
 missing space between '' and RAM_ADDR_FMT (for readability)
 
  +  in !=  RAM_ADDR_FMT, id, 
  length,
  + block-length);
  +ret =  -EINVAL;
  +}
   }
   break;
   }
 
 Can you please explain where it is ensured that the non-overwritten part
 of a greater RAMBlock is zero-filled? As far as I can see:
 
 main() [vl.c]
   qemu_run_machine_init_done_notifiers() [vl.c]
 notifier_list_notify() [util/notify.c]
   pc_guest_info_machine_done() [hw/i386/pc.c]
 acpi_setup() [hw/i386/acpi-build.c]
   acpi_add_rom_blob() [hw/i386/acpi-build.c]
 rom_add_blob() [hw/core/loader.c]
   rom_set_mr() [hw/core/loader.c]
 memory_region_init_ram() [memory.c]
   qemu_ram_alloc() [exec.c]
 memcpy()  -- populates it
   fw_cfg_add_file_callback() [hw/nvram/fw_cfg.c]
 fw_cfg_add_bytes_read_callback() [hw/nvram/fw_cfg.c] - 
 stores len
   qemu_start_incoming_migration() [migration.c]
 
 I currently cannot see where the trailing portion of the bigger RAMBlock

Re: [Qemu-devel] [RFC PATCH v2 00/49] Series short description

2014-07-28 Thread Pavel Dovgaluk

 From: Paolo Bonzini [mailto:paolo.bonz...@gmail.com] On Behalf Of Paolo 
 Bonzini
 Il 17/07/2014 13:01, Pavel Dovgalyuk ha scritto:
  This set of patches is related to the reverse execution and deterministic
  replay of qemu execution  Our implementation of deterministic replay can
  be used for deterministic and reverse debugging of guest code through gdb
  remote interface.
 
  Execution recording writes non-deterministic events log, which can be later
  used for replaying the execution anywhere and for unlimited number of times.
  It also supports checkpointing for faster rewinding during reverse 
  debugging.
  Execution replaying reads the log and replays all non-deterministic events
  including external input, hardware clocks, and interrupts.
 
 From a first look:
 
 - patches 2 to 13 probably should try to use subsections, so that VMs
 that do not use the devices try not to save the extra data and keep
 backwards migration compatibility (at least try to)

 Could you give me and example?
 As I know, subsection is loaded when some predicate function returns true. How 
can I 
construct such a function for integratorcp module? What kind of condition 
should it check?
 In this module I just added missed vmstates (it does not saved/restored at all
by the master version).

 - patch 16 should also use subsections, and perhaps apply to all other
 CPUs too?

 We implemented replay only for i386 and ARM. If we'll change other targets, it 
will not
add record/replay capabilities to them, but can confuse the reviewers.

 - patches 23-24-25 perhaps could try using icount, like Konrad's patch do?

 Using faster icount (like in Konrad's patches) is the our next aim. It 
obviously will 
increase the speed of recording process. But now I submitted slower, but more 
conservative
version of icount which we had already tested.

 - patch 27 makes sense but VIRTUAL is used to skip blinking when the VM
 is stopped

 Right, this is kind of hack. I haven't found better solution yet.

 - the others I haven't yet looked at, but they look like something that
 would bitrot really, really fast.  For patch 33, I think changing
 INSERT_HEAD to INSERT_TAIL would be just fine, and I wonder if it's the
 same for other patches here.  How do you plan on testing them and
 keeping them up to date?

 We're constantly keeping these patches up to date, because we are using 
deterministic replay
and reverse debugging for solving our tasks. We re-test all the features when 
pulling new
patches from the master branch.

Pavel Dovgalyuk

 
  Reverse execution has the following features:
   * Deterministically replays whole system execution and all contents of the 
  memory,
 state of the hadrware devices, clocks, and screen of the VM.
   * Writes execution log into the file for latter replaying for multiple 
  times
 on different machines.
   * Supports i386, x86_64, and ARM hardware platforms.
   * Performs deterministic replay of all operations with keyboard, mouse, 
  network adapters,
 audio devices, serial interfaces, and physical USB devices connected to 
  the emulator.
   * Provides support for gdb reverse debugging commands like reverse-step 
  and reverse-
 continue.
   * Supports auto-checkpointing for convenient reverse debugging.
 
  Usage of the record/replay:
   * First, record the execution, by adding '-record fname=replay.bin' to the
 command line.
   * Then you can replay it for the multiple times by using another command
 line option: '-replay fname=replay.bin'
   * Virtual machine should have at least one virtual disk, which is used to
 store checkpoints. If you want to enable automatic checkpointing, simply
 add ',period=XX' to record options, where XX is the checkpointing period
 in seconds.
   * Using of the network adapters in record/replay mode is possible with
 the following command-line options:
 - '-net user' (or another host adapter) in record mode
 - '-net replay' in replay mode. Every host network adapter should be
   replaced by 'replay' when replaying the execution.
   * Reverse debugging can be used through gdb remote interface.
 reverse-stepi and reverse-continue commands are supported. Other reverse
 commands should also work, because they reuse these ones.
   * Monitor is extended by the following commands:
 - replay_info - prints information about replay mode and current step
   (number of instructions executed)
 - replay_break - sets breakpoint at the specified instructions count.
 - replay_seek - rewinds (using the checkpoints, if possible) to the
   specified step of replay log.
 
  Paper with short description of deterministic replay implementation:
  http://www.computer.org/csdl/proceedings/csmr/2012/4666/00/4666a553-abs.html
 
  Modifications of qemu include:
   * adding missed fields of the virtual devices' states to the vmstate
 structures to allow deterministic saving and restoring the VM state
   * adding virtual

Re: [Qemu-devel] [Qemu-ppc] [PATCH for-2.2] spapr: add host Linux version information to device tree



 Am 28.07.2014 um 08:47 schrieb Alexey Kardashevskiy a...@ozlabs.ru:
 
 On 07/24/2014 11:15 PM, Alexander Graf wrote:
 
 On 18.07.14 06:31, cyril...@gmail.com wrote:
 It may prove useful know which Linux distribution version the host machine
 is running when an issue in the guest arises but a user cannot access
 the host.
 
 Signed-off-by: Cyril Bur cyril@au1.ibm.com
 ---
  hw/ppc/spapr.c   |  8 +++
  target-ppc/kvm.c | 62
 
  target-ppc/kvm_ppc.h |  6 +
  3 files changed, 76 insertions(+)
 
 diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
 index 6b48a26..391d47a 100644
 --- a/hw/ppc/spapr.c
 +++ b/hw/ppc/spapr.c
 @@ -375,6 +375,14 @@ static void *spapr_create_fdt_skel(hwaddr initrd_base,
  _FDT((fdt_property_string(fdt, vm,uuid, buf)));
  g_free(buf);
  +/*
 + * Add info to the guest FDT to tell it what linux the host is
 + */
 +if (kvmppc_get_linux_host(buf)) {
 +_FDT((fdt_property_string(fdt, linux,host, buf)));
 
 Is this even specified in sPAPR?
 
 
 PAPR does not know about any linux,xxx properties.
 
 
 +g_free(buf);
 +}
 +
  _FDT((fdt_property_cell(fdt, #address-cells, 0x2)));
  _FDT((fdt_property_cell(fdt, #size-cells, 0x2)));
  diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
 index 8c9e79c..95e0970 100644
 --- a/target-ppc/kvm.c
 +++ b/target-ppc/kvm.c
 @@ -1415,6 +1415,68 @@ bool kvmppc_get_host_model(char **value)
  return g_file_get_contents(/proc/device-tree/model, value, NULL,
 NULL);
  }
  +bool kvmppc_get_linux_host(char **value)
 +{
 +FILE *f;
 +int i;
 +char line[512];
 +const char *names[] = {NAME, VERSION, BUILD_ID};
 +bool names_found[ARRAY_SIZE(names)] = { 0 };
 +GString *output = NULL;
 +f = fopen(/etc/os-release, r);
 
 A few comments:
 
  1) Why would anyone care?
 
 
 Useful debug info when the host is not reachable.
 
 
  2) I'm not sure KVM is the right decision maker on whether we want this
 exposed or not.
 
 Good point, there is no reason not to show this info in TCG.
 
 After all, the files you read here are available on an x86
 host just as well
 
 Does qemu-x86 has any way to pass information like this to the guest?

Yes and no. It has fw_cfg where we could put it and it has DSST generation 
where we could also put it I guess. Just like here, there's no standardized way 
to expose it though.

Alex

 
 
  3) Use glib functions to read files
 
 
 Alex
 
 
 -- 
 Alexey

Re: [Qemu-devel] [PATCH for-2.1 2/2] acpi: mark ACPI tables ROM blob as extend-able on migration

On Fri, 25 Jul 2014 19:37:41 +0200
Paolo Bonzini pbonz...@redhat.com wrote:

 Il 25/07/2014 17:48, Igor Mammedov ha scritto:
  It fixes migration failure for machine type pc-i440fx-1.7 from
  QEMU 1.7/2.0 to QEMU 2.1
  
  Migration fails due to ACPI tables size grows across 1.7-2.1
  versions. That causes ACPI tables ROM blob to change its size
  differently for the same configurations on different QEMU versions.
  As result migration code bails out when attempting to load
  smaller ROM blob into a bigger one on a newer QEMU.
  To trigger failure it's enough to add pci-bridge device to QEMU CLI
  
  Marking ACPI tables ROM blob as extend-able on migration allows
  QEMU to load smaller ACPI tables from QEMU 1.7/2.0, fixing
  forward migration failure introduced since 2.0 which affects
  only configurations that cause ACPI ROM blob size change.
 
 This works in this case, and it is more friendly to downstreams indeed.
  It also is simpler, at least on the surface.  I think the ramifications
 could be a bit larger than with my own patches, but still I guess it's
 more appropriate at this point of the release cycle.
 
 It doesn't handle the case of ACPI tables that shrink, which can happen
 as well.  I guess if this ever happens we can just hard-code the table
 size of the old versions to something big enough (64K?) and keep using
 fine-grained sizing.
That was intentionally omitted in here so far size only goes up from 1.7 to
2.1. My though was that can enforce minimum size later during 2.2 cycle
Anyway, I'll think more about it, and maybe post additional patch
on top of this to set minimum size if I find a reason for it to be in 2.1.


 
 I'd like a day or two to mull about it, but I have it even if the
 patches are applied.  Peter, feel free to go ahead with Igor's patches.
 
 Paolo

[Qemu-devel] [PATCH for-2.1 v2 0/2] Fix migration failure due to ACPI tables size changes


Changes since v2:
  - addressed Laszlo's comments
 * fixing typos, rewording comments
 * dropping enum-ification of RAMBlock flags
 * adding zeroing out destination ramblock
 * replacing 'if' with assert() 

Changing the ACPI table size causes migration to break, and the memory
hotplug work opened our eyes on how horribly we were breaking things in
2.0 already.

To trigger issue start
  QEMU-1.7 with -M pc-i440fx-1.7 -device pci-bridge,chassis_nr=1
and try to migrate to QEMU-2.1 or QEMU-2.0 as result target will fail with:
  qemu-system-x86_64: Length mismatch: /rom@etc/acpi/tables: 2000 in != 3000

This fix allows target QEMU to load smaller RAMBlock into a bigger one
and fixes regression which was introduced since 2.0, allowing
forward migration from 1.7/2.0 to 2.1
Fix is also suitable for stable-2.0.


Igor Mammedov (2):
  migration: load smaller RAMBlock to a bigger one if permitted
  acpi: mark ACPI tables ROM blob as extend-able on migration

 arch_init.c | 22 +-
 exec.c  |  8 
 hw/core/loader.c|  6 +-
 hw/i386/acpi-build.c|  2 +-
 include/exec/memory.h   | 11 +++
 include/exec/ram_addr.h |  3 +++
 include/hw/loader.h |  5 +++--
 memory.c|  5 +
 8 files changed, 53 insertions(+), 9 deletions(-)

-- 
1.8.3.1

[Qemu-devel] [PATCH for-2.1 v2 1/2] migration: load smaller RAMBlock to a bigger one if permitted

Add API to mark memory region as extend-able on migration,
to allow migration code to load smaller RAMBlock into
a bigger one on destination QEMU instance.

This will allow to fix broken migration from QEMU 1.7/2.0 to
QEMU 2.1 due to  ACPI tables size changes across 1.7/2.0/2.1
versions by marking ACPI tables ROM blob as extend-able.
So that smaller tables from previous version could be always
migrated to a bigger rom blob on new version.

Credits-for-idea: Michael S. Tsirkin m...@redhat.com
Signed-off-by: Igor Mammedov imamm...@redhat.com
---
v2:
  fixed patch as suggested by Laszlo
---
 arch_init.c | 22 +-
 exec.c  |  8 
 include/exec/memory.h   | 11 +++
 include/exec/ram_addr.h |  3 +++
 memory.c|  5 +
 5 files changed, 44 insertions(+), 5 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index 8ddaf35..2c0c238 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -1071,11 +1071,23 @@ static int ram_load(QEMUFile *f, void *opaque, int 
version_id)
 
 QTAILQ_FOREACH(block, ram_list.blocks, next) {
 if (!strncmp(id, block-idstr, sizeof(id))) {
-if (block-length != length) {
-error_report(Length mismatch: %s:  RAM_ADDR_FMT
-  in !=  RAM_ADDR_FMT, id, length,
- block-length);
-ret =  -EINVAL;
+if (block-flags  RAM_EXTENDABLE_ON_MIGRATE) {
+if (block-length  length) {
+error_report(Length too big: %s:  
RAM_ADDR_FMT
+  in   RAM_ADDR_FMT, id, length,
+ block-length);
+ret =  -EINVAL;
+} else {
+memset(block-host, 0, block-length);
+}
+} else {
+if (block-length != length) {
+error_report(Length mismatch: %s: 
+ RAM_ADDR_FMT  in != 
+ RAM_ADDR_FMT,
+ id, length, block-length);
+ret =  -EINVAL;
+}
 }
 break;
 }
diff --git a/exec.c b/exec.c
index 765bd94..02536f8e 100644
--- a/exec.c
+++ b/exec.c
@@ -1214,6 +1214,14 @@ void qemu_ram_unset_idstr(ram_addr_t addr)
 }
 }
 
+void qemu_ram_set_extendable_on_migration(ram_addr_t addr)
+{
+RAMBlock *block = find_ram_block(addr);
+
+assert(block != NULL);
+block-flags |= RAM_EXTENDABLE_ON_MIGRATE;
+}
+
 static int memory_try_enable_merging(void *addr, size_t len)
 {
 if (!qemu_opt_get_bool(qemu_get_machine_opts(), mem-merge, true)) {
diff --git a/include/exec/memory.h b/include/exec/memory.h
index e2c8e3e..f96ddbb 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -894,6 +894,17 @@ bool memory_region_present(MemoryRegion *container, hwaddr 
addr);
 bool memory_region_is_mapped(MemoryRegion *mr);
 
 /**
+ * memory_region_permit_extendable_migration: marks #MemoryRegion
+ * as extendable on migration, allowing the migration code to load
+ * source memory block of smaller size than destination memory block
+ * at migration time
+ *
+ * @mr: a #MemoryRegion whose #RAMBlock should be marked as
+ *  extendable on migration
+ */
+void memory_region_permit_extendable_migration(MemoryRegion *mr);
+
+/**
  * memory_region_find: translate an address/size relative to a
  * MemoryRegion into a #MemoryRegionSection.
  *
diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
index 6593be1..7a6b782 100644
--- a/include/exec/ram_addr.h
+++ b/include/exec/ram_addr.h
@@ -34,6 +34,9 @@ void *qemu_get_ram_ptr(ram_addr_t addr);
 void qemu_ram_free(ram_addr_t addr);
 void qemu_ram_free_from_ptr(ram_addr_t addr);
 
+#define RAM_EXTENDABLE_ON_MIGRATE (1U  31)
+void qemu_ram_set_extendable_on_migration(ram_addr_t addr);
+
 static inline bool cpu_physical_memory_get_dirty(ram_addr_t start,
  ram_addr_t length,
  unsigned client)
diff --git a/memory.c b/memory.c
index 64d7176..744c746 100644
--- a/memory.c
+++ b/memory.c
@@ -1791,6 +1791,11 @@ bool memory_region_is_mapped(MemoryRegion *mr)
 return mr-container ? true : false;
 }
 
+void memory_region_permit_extendable_migration(MemoryRegion *mr)
+{
+qemu_ram_set_extendable_on_migration(mr-ram_addr);
+}
+
 MemoryRegionSection memory_region_find(MemoryRegion *mr,
hwaddr addr, uint64_t size)
 {
-- 
1.8.3.1

[Qemu-devel] [PATCH for-2.1 v2 2/2] acpi: mark ACPI tables ROM blob as extend-able on migration

It fixes migration failure for machine type pc-i440fx-1.7 from
QEMU 1.7/2.0 to QEMU 2.1

Migration fails due to ACPI tables size grows across 1.7-2.1
versions. That causes ACPI tables ROM blob to change its size
differently for the same configurations on different QEMU versions.
As result migration code bails out when attempting to load
smaller ROM blob into a bigger one on a newer QEMU.
To trigger failure it's enough to add pci-bridge device to QEMU CLI

Marking ACPI tables ROM blob as extend-able on migration allows
QEMU to load smaller ACPI tables from QEMU 1.7/2.0, fixing
forward migration failure introduced since 2.0 which affects
only configurations that cause ACPI ROM blob size change.

Signed-off-by: Igor Mammedov imamm...@redhat.com
Reviewed-by: Laszlo Ersek ler...@redhat.com
---
 hw/core/loader.c | 6 +-
 hw/i386/acpi-build.c | 2 +-
 include/hw/loader.h  | 5 +++--
 3 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/hw/core/loader.c b/hw/core/loader.c
index 2bf6b8f..d09b894 100644
--- a/hw/core/loader.c
+++ b/hw/core/loader.c
@@ -722,7 +722,8 @@ err:
 
 void *rom_add_blob(const char *name, const void *blob, size_t len,
hwaddr addr, const char *fw_file_name,
-   FWCfgReadCallback fw_callback, void *callback_opaque)
+   FWCfgReadCallback fw_callback, void *callback_opaque,
+   bool extendable)
 {
 Rom *rom;
 void *data = NULL;
@@ -742,6 +743,9 @@ void *rom_add_blob(const char *name, const void *blob, 
size_t len,
 
 if (rom_file_has_mr) {
 data = rom_set_mr(rom, OBJECT(fw_cfg), devpath);
+if (extendable) {
+memory_region_permit_extendable_migration(rom-mr);
+}
 } else {
 data = rom-data;
 }
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index ebc5f03..651c06b 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -1558,7 +1558,7 @@ static void *acpi_add_rom_blob(AcpiBuildState 
*build_state, GArray *blob,
const char *name)
 {
 return rom_add_blob(name, blob-data, acpi_data_len(blob), -1, name,
-acpi_build_update, build_state);
+acpi_build_update, build_state, true);
 }
 
 static const VMStateDescription vmstate_acpi_build = {
diff --git a/include/hw/loader.h b/include/hw/loader.h
index 796cbf9..e5a24ac 100644
--- a/include/hw/loader.h
+++ b/include/hw/loader.h
@@ -57,7 +57,8 @@ int rom_add_file(const char *file, const char *fw_dir,
  bool option_rom);
 void *rom_add_blob(const char *name, const void *blob, size_t len,
hwaddr addr, const char *fw_file_name,
-   FWCfgReadCallback fw_callback, void *callback_opaque);
+   FWCfgReadCallback fw_callback, void *callback_opaque,
+   bool extendable);
 int rom_add_elf_program(const char *name, void *data, size_t datasize,
 size_t romsize, hwaddr addr);
 int rom_load_all(void);
@@ -70,7 +71,7 @@ void do_info_roms(Monitor *mon, const QDict *qdict);
 #define rom_add_file_fixed(_f, _a, _i)  \
 rom_add_file(_f, NULL, _a, _i, false)
 #define rom_add_blob_fixed(_f, _b, _l, _a)  \
-rom_add_blob(_f, _b, _l, _a, NULL, NULL, NULL)
+rom_add_blob(_f, _b, _l, _a, NULL, NULL, NULL, false)
 
 #define PC_ROM_MIN_VGA 0xc
 #define PC_ROM_MIN_OPTION  0xc8000
-- 
1.8.3.1

Re: [Qemu-devel] [PATCH for-2.1 1/2] migration: load smaller RAMBlock to a bigger one if permitted

2014-07-28 Thread Laszlo Ersek

On 07/28/14 09:40, Igor Mammedov wrote:
 On Fri, 25 Jul 2014 19:56:40 +0200
 Laszlo Ersek ler...@redhat.com wrote:
 
 On 07/25/14 17:48, Igor Mammedov wrote:

 Add API to mark memory region as extend-able on migration,
 to allow migration code to load smaller RAMBlock into
 a bigger one on destination QEMU instance.

 This will allow to fix broken migration from QEMU 1.7/2.0 to
 QEMU 2.1 due to  ACPI tables size changes across 1.7/2.0/2.1
 versions by marking ACPI tables ROM blob as extend-able.
 So that smaller tables from previos version could be always

 (previous)

 migrated to a bigger rom blob on new version.

 Credits-for-idea: Michael S. Tsirkin m...@redhat.com
 Signed-off-by: Igor Mammedov imamm...@redhat.com
 ---
  arch_init.c | 19 ++-
  exec.c  | 15 +--
  include/exec/cpu-all.h  | 15 ++-
  include/exec/memory.h   | 10 ++
  include/exec/ram_addr.h |  1 +
  memory.c|  5 +
  6 files changed, 53 insertions(+), 12 deletions(-)

 (Please pass the -O flag to git-format-patch, with an order file that
 moves header files (*.h) to the front. Header hunks should lead in a
 patch. Thanks.)

 diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
 index 6593be1..248c075 100644
 --- a/include/exec/ram_addr.h
 +++ b/include/exec/ram_addr.h
 @@ -33,6 +33,7 @@ void *qemu_get_ram_block_host_ptr(ram_addr_t addr);
  void *qemu_get_ram_ptr(ram_addr_t addr);
  void qemu_ram_free(ram_addr_t addr);
  void qemu_ram_free_from_ptr(ram_addr_t addr);
 +void qemu_ram_set_extendable_on_migration(ram_addr_t addr);

  static inline bool cpu_physical_memory_get_dirty(ram_addr_t start,
   ram_addr_t length,

 (Ugh. The declarations (prototypes) of qemu_ram_*() functions are
 scattered between ram_addr.h and cpu-common.h (both under
 include/exec). I can't see why that is a good thing (the function
 definitions all seem to be in exec.c).)
 Trying to avoid putting stuff to global cpu-common.h, I've put
 definition in ram_addr.h. The rest should be moved from cpu-common.h
 to ram_addr.h when 2.2 is open.
 
 [...]
 

 diff --git a/arch_init.c b/arch_init.c
 index 8ddaf35..812f8b5 100644
 --- a/arch_init.c
 +++ b/arch_init.c
 @@ -1071,11 +1071,20 @@ static int ram_load(QEMUFile *f, void *opaque, int 
 version_id)

  QTAILQ_FOREACH(block, ram_list.blocks, next) {
  if (!strncmp(id, block-idstr, sizeof(id))) {
 -if (block-length != length) {
 -error_report(Length mismatch: %s:  
 RAM_ADDR_FMT
 -  in !=  RAM_ADDR_FMT, id, 
 length,
 - block-length);
 -ret =  -EINVAL;
 +if (block-flags  RAM_EXTENDABLE_ON_MIGRATE) {
 +if (block-length  length) {
 +error_report(Length too big: %s: 
 RAM_ADDR_FMT

 missing space between '' and RAM_ADDR_FMT (for readability)


 +RAM_ADDR_FMT, id, 
 length,

 you dropped the important word in here (cf. in != above). That word
 explains it to the user that the incoming size (stored in length) is
 too big.

 + block-length);
 +ret =  -EINVAL;
 +}
 +} else {
 +if (block-length != length) {
 +error_report(Length mismatch: %s: 
 RAM_ADDR_FMT

 missing space between '' and RAM_ADDR_FMT (for readability)

 +  in !=  RAM_ADDR_FMT, id, 
 length,
 + block-length);
 +ret =  -EINVAL;
 +}
  }
  break;
  }

 Can you please explain where it is ensured that the non-overwritten part
 of a greater RAMBlock is zero-filled? As far as I can see:

 main() [vl.c]
   qemu_run_machine_init_done_notifiers() [vl.c]
 notifier_list_notify() [util/notify.c]
   pc_guest_info_machine_done() [hw/i386/pc.c]
 acpi_setup() [hw/i386/acpi-build.c]
   acpi_add_rom_blob() [hw/i386/acpi-build.c]
 rom_add_blob() [hw/core/loader.c]
   rom_set_mr() [hw/core/loader.c]
 memory_region_init_ram() [memory.c]
   qemu_ram_alloc() [exec.c]
 memcpy()  -- populates it
   fw_cfg_add_file_callback() [hw/nvram/fw_cfg.c]
 fw_cfg_add_bytes_read_callback() [hw/nvram/fw_cfg.c] - 
 stores len
   qemu_start_incoming_migration() [migration.c]

 I currently cannot see where the trailing portion of the bigger RAMBlock
 would be overwritten with zeroes.

Re: [Qemu-devel] [PATCH] hw/audio/intel-hda: Fix MSI capability address

Il 27/07/2014 08:57, Jan Kiszka ha scritto:
 From: Jan Kiszka jan.kis...@siemens.com
 
 According to ICH9 spec, the MSI capability is located at 0x60. This is
 important for guest drivers that do not parse the capability chain and
 use absolute addresses instead.
 
 Signed-off-by: Jan Kiszka jan.kis...@siemens.com
 ---
  hw/audio/intel-hda.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)
 
 diff --git a/hw/audio/intel-hda.c b/hw/audio/intel-hda.c
 index aa49b47..09c4118 100644
 --- a/hw/audio/intel-hda.c
 +++ b/hw/audio/intel-hda.c
 @@ -1141,7 +1141,7 @@ static int intel_hda_init(PCIDevice *pci)
intel-hda, 0x4000);
  pci_register_bar(d-pci, 0, 0, d-mmio);
  if (d-msi) {
 -msi_init(d-pci, 0x50, 1, true, false);
 +msi_init(d-pci, 0x60, 1, true, false);
  }
 
  hda_codec_bus_init(DEVICE(pci), d-codecs, sizeof(d-codecs),
 

Does this need a compat property?

Paolo

Re: [Qemu-devel] [PATCH] hw/audio/intel-hda: Fix MSI capability address

2014-07-28 Thread Jan Kiszka

On 2014-07-28 10:11, Paolo Bonzini wrote:
 Il 27/07/2014 08:57, Jan Kiszka ha scritto:
 From: Jan Kiszka jan.kis...@siemens.com

 According to ICH9 spec, the MSI capability is located at 0x60. This is
 important for guest drivers that do not parse the capability chain and
 use absolute addresses instead.

 Signed-off-by: Jan Kiszka jan.kis...@siemens.com
 ---
  hw/audio/intel-hda.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

 diff --git a/hw/audio/intel-hda.c b/hw/audio/intel-hda.c
 index aa49b47..09c4118 100644
 --- a/hw/audio/intel-hda.c
 +++ b/hw/audio/intel-hda.c
 @@ -1141,7 +1141,7 @@ static int intel_hda_init(PCIDevice *pci)
intel-hda, 0x4000);
  pci_register_bar(d-pci, 0, 0, d-mmio);
  if (d-msi) {
 -msi_init(d-pci, 0x50, 1, true, false);
 +msi_init(d-pci, 0x60, 1, true, false);
  }

  hda_codec_bus_init(DEVICE(pci), d-codecs, sizeof(d-codecs),

 
 Does this need a compat property?

Sigh, right, it's guest visible.

Jan




signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH for-2.1 2/2] acpi: mark ACPI tables ROM blob as extend-able on migration

Il 28/07/2014 09:56, Igor Mammedov ha scritto:
  It doesn't handle the case of ACPI tables that shrink, which can happen
  as well.  I guess if this ever happens we can just hard-code the table
  size of the old versions to something big enough (64K?) and keep using
  fine-grained sizing.
 That was intentionally omitted in here so far size only goes up from 1.7 to
 2.1. My though was that can enforce minimum size later during 2.2 cycle
 Anyway, I'll think more about it, and maybe post additional patch
 on top of this to set minimum size if I find a reason for it to be in 2.1.

No, there's no need for it in 2.1.

Paolo

Re: [Qemu-devel] [PATCH for-2.1 v2 1/2] migration: load smaller RAMBlock to a bigger one if permitted

2014-07-28 Thread Laszlo Ersek

On 07/28/14 10:03, Igor Mammedov wrote:
 Add API to mark memory region as extend-able on migration,
 to allow migration code to load smaller RAMBlock into
 a bigger one on destination QEMU instance.
 
 This will allow to fix broken migration from QEMU 1.7/2.0 to
 QEMU 2.1 due to  ACPI tables size changes across 1.7/2.0/2.1
 versions by marking ACPI tables ROM blob as extend-able.
 So that smaller tables from previous version could be always
 migrated to a bigger rom blob on new version.
 
 Credits-for-idea: Michael S. Tsirkin m...@redhat.com
 Signed-off-by: Igor Mammedov imamm...@redhat.com
 ---
 v2:
   fixed patch as suggested by Laszlo
 ---
  arch_init.c | 22 +-
  exec.c  |  8 
  include/exec/memory.h   | 11 +++
  include/exec/ram_addr.h |  3 +++
  memory.c|  5 +
  5 files changed, 44 insertions(+), 5 deletions(-)

Thank you.

Reviewed-by: Laszlo Ersek ler...@redhat.com

Re: [Qemu-devel] Possible null-ptr dereference

2014-07-28 Thread mateusz.krzywicki

Hey,

Yup, thanks, task closed ;-)

Best regards,
Mateusz Krzywicki

From: arei.gong...@huawei.com
To: mateusz.krzywi...@windowslive.com; qemu-devel@nongnu.org
CC: stefa...@redhat.com; kw...@redhat.com
Subject: RE: [Qemu-devel] Possible null-ptr dereference
Date: Mon, 28 Jul 2014 06:03:45 +









Hi,
 
Should be easy to fix though. Does the following help?
 
(Cc’ing Stefan  Kevin)
 
--
xen_disk:  fix possible null-ptr dereference

 
Signed-off-by: Gonglei arei.gong...@huawei.com
---
hw/block/xen_disk.c | 1 +
1
 file changed, 1 insertion(+)
 
diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
index aed5b5b..a221d0b 100644
--- a/hw/block/xen_disk.c
+++ b/hw/block/xen_disk.c
@@ -589,6 +589,7 @@ static int blk_send_response_one(struct ioreq *ioreq)
 break;
 default:
 dst = NULL;
+return 0;
 }
 memcpy(dst, resp, sizeof(resp));
 blkdev-rings.common.rsp_prod_pvt++;
--
 
Best regards,
-Gonglei

 



From: qemu-devel-bounces+arei.gonglei=huawei@nongnu.org 
[mailto:qemu-devel-bounces+arei.gonglei=huawei@nongnu.org]
On Behalf Of mateusz.krzywi...@windowslive.com

Sent: Saturday, July 26, 2014 6:52 PM

To: qemu-devel@nongnu.org

Subject: [Qemu-devel] Possible null-ptr dereference


 


Hey,


 


Found a little bug in latest qemu:


 


In function: 


static int blk_send_response_one(struct ioreq *ioreq)


 


File: 


qemu\hw\block\xen_disk.c


 


Code:


 



default:


dst = NULL;


}


memcpy(dst, resp, sizeof(resp));



 


 


Just add simple check for dst and it will be all cool ;-)


 


Best regards,


Mateusz Krzywicki

Re: [Qemu-devel] [PATCH v2 1/7] bootindex: add modify_boot_device_path function

2014-07-28 Thread Gerd Hoffmann

  Hi,

  ... because you can just copy the suffix from the old entry here,
  instead of expecting the caller pass it in.
  
 Okay, agreed.
 
 But we should also think about the situation which a device don't have
 old entry in global fw_boot_order list.

Throw an error?

I think it is ok to allow only *changing* the bootindex.

cheers,
  Gerd

Re: [Qemu-devel] [RFC PATCH v2 08/49] hpet: fixing saving and loading process

Il 17/07/2014 13:02, Pavel Dovgalyuk ha scritto:
 VM clock does not run while saving, so there is no need for saving the ticks
 in HPET. Also added saving of hpet_offset field.
 
 Signed-off-by: Pavel Dovgalyuk pavel.dovga...@ispras.ru
 ---
  hw/timer/hpet.c |   13 +
  1 files changed, 1 insertions(+), 12 deletions(-)
 
 diff --git a/hw/timer/hpet.c b/hw/timer/hpet.c
 index e160e8f..73401b9 100644
 --- a/hw/timer/hpet.c
 +++ b/hw/timer/hpet.c
 @@ -222,14 +222,6 @@ static void update_irq(struct HPETTimer *timer, int set)
  }
  }
  
 -static void hpet_pre_save(void *opaque)
 -{
 -HPETState *s = opaque;
 -
 -/* save current counter value */
 -s-hpet_counter = hpet_get_ticks(s);
 -}
 -
  static int hpet_pre_load(void *opaque)
  {
  HPETState *s = opaque;
 @@ -255,9 +247,6 @@ static int hpet_post_load(void *opaque, int version_id)
  {
  HPETState *s = opaque;
  
 -/* Recalculate the offset between the main counter and guest time */
 -s-hpet_offset = ticks_to_ns(s-hpet_counter) - 
 qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
 -
  /* Push number of timers into capability returned via HPET_ID */
  s-capability = ~HPET_ID_NUM_TIM_MASK;
  s-capability |= (s-num_timers - 1)  HPET_ID_NUM_TIM_SHIFT;
 @@ -308,13 +297,13 @@ static const VMStateDescription vmstate_hpet = {
  .name = hpet,
  .version_id = 2,
  .minimum_version_id = 1,
 -.pre_save = hpet_pre_save,
  .pre_load = hpet_pre_load,
  .post_load = hpet_post_load,
  .fields = (VMStateField[]) {
  VMSTATE_UINT64(config, HPETState),
  VMSTATE_UINT64(isr, HPETState),
  VMSTATE_UINT64(hpet_counter, HPETState),
 +VMSTATE_UINT64(hpet_offset, HPETState),

This needs a version bump.

Paolo

  VMSTATE_UINT8_V(num_timers, HPETState, 2),
  VMSTATE_VALIDATE(num_timers in range, hpet_validate_num_timers),
  VMSTATE_STRUCT_VARRAY_UINT8(timer, HPETState, num_timers, 0,

Re: [Qemu-devel] [PATCH v2 1/7] bootindex: add modify_boot_device_path function

 -Original Message-
 From: Gerd Hoffmann [mailto:kra...@redhat.com]
 Sent: Monday, July 28, 2014 4:31 PM
 Subject: Re: [PATCH v2 1/7] bootindex: add modify_boot_device_path function

   Hi,

   ... because you can just copy the suffix from the old entry here,
   instead of expecting the caller pass it in.

  Okay, agreed.

  But we should also think about the situation which a device don't have
  old entry in global fw_boot_order list.

 Throw an error?

No. I should firstly use the suffix which the caller pass in IMHO,
just like V3 I have posted, Thanks.

 I think it is ok to allow only *changing* the bootindex.

Yes, that's no problem.

Best regards,
-Gonglei

Re: [Qemu-devel] [PATCH for-2.1 v2 2/2] acpi: mark ACPI tables ROM blob as extend-able on migration

On Mon, 28 Jul 2014 08:03:25 +
Igor Mammedov imamm...@redhat.com wrote:

 It fixes migration failure for machine type pc-i440fx-1.7 from
 QEMU 1.7/2.0 to QEMU 2.1
 
 Migration fails due to ACPI tables size grows across 1.7-2.1
 versions. That causes ACPI tables ROM blob to change its size
 differently for the same configurations on different QEMU versions.
 As result migration code bails out when attempting to load
 smaller ROM blob into a bigger one on a newer QEMU.
 To trigger failure it's enough to add pci-bridge device to QEMU CLI
 
 Marking ACPI tables ROM blob as extend-able on migration allows
 QEMU to load smaller ACPI tables from QEMU 1.7/2.0, fixing
 forward migration failure introduced since 2.0 which affects
 only configurations that cause ACPI ROM blob size change.
 
 Signed-off-by: Igor Mammedov imamm...@redhat.com
 Reviewed-by: Laszlo Ersek ler...@redhat.com
Self-NACK

I'm sorry for mess, I've tested it on for i386 target,
but patch breaks build for other targets.
I'll resubmit v3 shortly.

 ---
  hw/core/loader.c | 6 +-
  hw/i386/acpi-build.c | 2 +-
  include/hw/loader.h  | 5 +++--
  3 files changed, 9 insertions(+), 4 deletions(-)
 
 diff --git a/hw/core/loader.c b/hw/core/loader.c
 index 2bf6b8f..d09b894 100644
 --- a/hw/core/loader.c
 +++ b/hw/core/loader.c
 @@ -722,7 +722,8 @@ err:
  
  void *rom_add_blob(const char *name, const void *blob, size_t len,
 hwaddr addr, const char *fw_file_name,
 -   FWCfgReadCallback fw_callback, void *callback_opaque)
 +   FWCfgReadCallback fw_callback, void *callback_opaque,
 +   bool extendable)
  {
  Rom *rom;
  void *data = NULL;
 @@ -742,6 +743,9 @@ void *rom_add_blob(const char *name, const void *blob, 
 size_t len,
  
  if (rom_file_has_mr) {
  data = rom_set_mr(rom, OBJECT(fw_cfg), devpath);
 +if (extendable) {
 +memory_region_permit_extendable_migration(rom-mr);
 +}
  } else {
  data = rom-data;
  }
 diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
 index ebc5f03..651c06b 100644
 --- a/hw/i386/acpi-build.c
 +++ b/hw/i386/acpi-build.c
 @@ -1558,7 +1558,7 @@ static void *acpi_add_rom_blob(AcpiBuildState 
 *build_state, GArray *blob,
 const char *name)
  {
  return rom_add_blob(name, blob-data, acpi_data_len(blob), -1, name,
 -acpi_build_update, build_state);
 +acpi_build_update, build_state, true);
  }
  
  static const VMStateDescription vmstate_acpi_build = {
 diff --git a/include/hw/loader.h b/include/hw/loader.h
 index 796cbf9..e5a24ac 100644
 --- a/include/hw/loader.h
 +++ b/include/hw/loader.h
 @@ -57,7 +57,8 @@ int rom_add_file(const char *file, const char *fw_dir,
   bool option_rom);
  void *rom_add_blob(const char *name, const void *blob, size_t len,
 hwaddr addr, const char *fw_file_name,
 -   FWCfgReadCallback fw_callback, void *callback_opaque);
 +   FWCfgReadCallback fw_callback, void *callback_opaque,
 +   bool extendable);
  int rom_add_elf_program(const char *name, void *data, size_t datasize,
  size_t romsize, hwaddr addr);
  int rom_load_all(void);
 @@ -70,7 +71,7 @@ void do_info_roms(Monitor *mon, const QDict *qdict);
  #define rom_add_file_fixed(_f, _a, _i)  \
  rom_add_file(_f, NULL, _a, _i, false)
  #define rom_add_blob_fixed(_f, _b, _l, _a)  \
 -rom_add_blob(_f, _b, _l, _a, NULL, NULL, NULL)
 +rom_add_blob(_f, _b, _l, _a, NULL, NULL, NULL, false)
  
  #define PC_ROM_MIN_VGA 0xc
  #define PC_ROM_MIN_OPTION  0xc8000

Re: [Qemu-devel] [RFC PATCH v2 07/49] kvmapic: fixing loading vmstate

Il 17/07/2014 13:02, Pavel Dovgalyuk ha scritto:
 vapic state should not be synchronized with APIC while loading,
 because APIC state could be not loaded yet at that moment.
 We just save vapic_paddr in APIC VMState instead of synchronization.

This comment is now obsolete:

include/hw/i386/apic_internal.h:hwaddr vapic_paddr; /* note: persistence 
via kvmvapic */

 Signed-off-by: Pavel Dovgalyuk pavel.dovga...@ispras.ru
 ---
  hw/i386/kvmvapic.c|   22 +-
  hw/intc/apic_common.c |5 -
  2 files changed, 25 insertions(+), 2 deletions(-)
 
 diff --git a/hw/i386/kvmvapic.c b/hw/i386/kvmvapic.c
 index cb855c7..417ab6a 100644
 --- a/hw/i386/kvmvapic.c
 +++ b/hw/i386/kvmvapic.c
 @@ -351,6 +351,24 @@ static int get_kpcr_number(X86CPU *cpu)
  return kpcr.number;
  }
  
 +static int vapic_enable_post_load(VAPICROMState *s, X86CPU *cpu)
 +{
 +int cpu_number = get_kpcr_number(cpu);
 +hwaddr vapic_paddr;
 +static const uint8_t enabled = 1;
 +
 +if (cpu_number  0) {
 +return -1;
 +}
 +vapic_paddr = s-vapic_paddr +
 +(((hwaddr)cpu_number)  VAPIC_CPU_SHIFT);
 +cpu_physical_memory_rw(vapic_paddr + offsetof(VAPICState, enabled),
 +   (void *)enabled, sizeof(enabled), 1);
 +s-state = VAPIC_ACTIVE;
 +
 +return 0;
 +}
 +
  static int vapic_enable(VAPICROMState *s, X86CPU *cpu)
  {
  int cpu_number = get_kpcr_number(cpu);
 @@ -731,7 +749,9 @@ static void do_vapic_enable(void *data)
  VAPICROMState *s = data;
  X86CPU *cpu = X86_CPU(first_cpu);
  
 -vapic_enable(s, cpu);
 +/* Do not synchronize with APIC, because it was not loaded yet.
 +   Just call the enable function which does not have synchronization. */
 +vapic_enable_post_load(s, cpu);
  }
  
  static int vapic_post_load(void *opaque, int version_id)
 diff --git a/hw/intc/apic_common.c b/hw/intc/apic_common.c
 index ce3d903..9d75ee0 100644
 --- a/hw/intc/apic_common.c
 +++ b/hw/intc/apic_common.c
 @@ -347,7 +347,7 @@ static int apic_dispatch_post_load(void *opaque, int 
 version_id)
  
  static const VMStateDescription vmstate_apic_common = {
  .name = apic,
 -.version_id = 3,
 +.version_id = 4,
  .minimum_version_id = 3,
  .minimum_version_id_old = 1,
  .load_state_old = apic_load_old,
 @@ -374,6 +374,9 @@ static const VMStateDescription vmstate_apic_common = {
  VMSTATE_INT64(next_time, APICCommonState),
  VMSTATE_INT64(timer_expiry,
APICCommonState), /* open-coded timer state */
 +VMSTATE_INT32_V(sipi_vector, APICCommonState, 4),
 +VMSTATE_INT32_V(wait_for_sipi, APICCommonState, 4),

This could be a subsection.  sipi_vector is only used (needed) if wait_for_sipi 
!= 0.

 +VMSTATE_UINT64_V(vapic_paddr, APICCommonState, 4),

Here you could also use a subsection, where the needed function returns false 
if vapic_paddr == 0.

Paolo


  VMSTATE_END_OF_LIST()
  }
  };

Re: [Qemu-devel] [PATCH v12 0/6] qcow2, raw: add preallocation=full and preallocation=falloc

2014-07-28 Thread Hu Tao

ping...

All the 6 patches have reviewed-by now.

On Fri, Jul 11, 2014 at 02:09:57PM +0800, Hu Tao wrote:
 This series adds two preallocation mode to qcow2 and raw:
 
 Option preallocation=full preallocates disk space for image by writing
 zeros to disk, this ensures disk space in any cases.
 
 Option preallocation=falloc preallocates disk space by calling
 posix_fallocate(). This is faster than preallocation=full.
 
 This series depends on patches 1-3 of Max's series 'qemu-img: Implement
 commit like QMP'. Specifically, patch 6 'qcow2: Add falloc and full
 preallocation option' uses minimal_blob_size() introduced by Max's patch
 'qcow2: Optimize bdrv_make_empty()'.
 
 The series is also at https://github.com/taohu/qemu/commits/preallocation-v12
 for you to check out, including depended patches from Max.
 
 Eric, I'm afraid now we missed qemu 2.1, so patch 1 is still sent
 with this series.
 
 changes to v11:
 
  - fix test case 049 (patch 4)
  - unsigned nl2e - uint64_t nl2e (patch 6)
  - use  instead of / (patch 6)
 
 
 Hu Tao (6):
   block: round up file size to nearest sector
   raw, qcow2: don't convert file size to sector size
   rename parse_enum_option to qapi_enum_parse and make it public
   qapi: introduce PreallocMode and a new PreallocMode full.
   raw-posix: Add falloc and full preallocation option
   qcow2: Add falloc and full preallocation option
 
  block/qcow2.c  | 56 +---
  block/raw-posix.c  | 92 
 +++---
  block/raw-win32.c  |  6 +--
  blockdev.c | 30 +++
  include/qapi/util.h| 17 +
  qapi/Makefile.objs |  1 +
  qapi/block-core.json   | 17 +
  qapi/qapi-util.c   | 32 
  tests/qemu-iotests/049.out |  2 +-
  tests/qemu-iotests/082.out | 54 +--
  tests/qemu-iotests/096 | 64 
  tests/qemu-iotests/096.out | 14 +++
  tests/qemu-iotests/group   |  1 +
  13 files changed, 296 insertions(+), 90 deletions(-)
  create mode 100644 include/qapi/util.h
  create mode 100644 qapi/qapi-util.c
  create mode 100755 tests/qemu-iotests/096
  create mode 100644 tests/qemu-iotests/096.out
 
 -- 
 1.9.3

[Qemu-devel] [PATCH for-2.1 v3 2/2] acpi: mark ACPI tables ROM blob as extend-able on migration

It fixes migration failure for machine type pc-i440fx-1.7 from
QEMU 1.7/2.0 to QEMU 2.1

Migration fails due to ACPI tables size grows across 1.7-2.1
versions. That causes ACPI tables ROM blob to change its size
differently for the same configurations on different QEMU versions.
As result migration code bails out when attempting to load
smaller ROM blob into a bigger one on a newer QEMU.
To trigger failure it's enough to add pci-bridge device to QEMU CLI

Marking ACPI tables ROM blob as extend-able on migration allows
QEMU to load smaller ACPI tables from QEMU 1.7/2.0, fixing
forward migration failure introduced since 2.0 which affects
only configurations that cause ACPI ROM blob size change.

Signed-off-by: Igor Mammedov imamm...@redhat.com
Reviewed-by: Laszlo Ersek ler...@redhat.com
---
v3:
  fix build breakage of lm32 target
---
 hw/core/loader.c   | 6 +-
 hw/i386/acpi-build.c   | 2 +-
 hw/lm32/lm32_hwsetup.h | 2 +-
 include/hw/loader.h| 5 +++--
 4 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/hw/core/loader.c b/hw/core/loader.c
index 2bf6b8f..d09b894 100644
--- a/hw/core/loader.c
+++ b/hw/core/loader.c
@@ -722,7 +722,8 @@ err:
 
 void *rom_add_blob(const char *name, const void *blob, size_t len,
hwaddr addr, const char *fw_file_name,
-   FWCfgReadCallback fw_callback, void *callback_opaque)
+   FWCfgReadCallback fw_callback, void *callback_opaque,
+   bool extendable)
 {
 Rom *rom;
 void *data = NULL;
@@ -742,6 +743,9 @@ void *rom_add_blob(const char *name, const void *blob, 
size_t len,
 
 if (rom_file_has_mr) {
 data = rom_set_mr(rom, OBJECT(fw_cfg), devpath);
+if (extendable) {
+memory_region_permit_extendable_migration(rom-mr);
+}
 } else {
 data = rom-data;
 }
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index ebc5f03..651c06b 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -1558,7 +1558,7 @@ static void *acpi_add_rom_blob(AcpiBuildState 
*build_state, GArray *blob,
const char *name)
 {
 return rom_add_blob(name, blob-data, acpi_data_len(blob), -1, name,
-acpi_build_update, build_state);
+acpi_build_update, build_state, true);
 }
 
 static const VMStateDescription vmstate_acpi_build = {
diff --git a/hw/lm32/lm32_hwsetup.h b/hw/lm32/lm32_hwsetup.h
index 9fd5e69..943130c 100644
--- a/hw/lm32/lm32_hwsetup.h
+++ b/hw/lm32/lm32_hwsetup.h
@@ -73,7 +73,7 @@ static inline void hwsetup_free(HWSetup *hw)
 static inline void hwsetup_create_rom(HWSetup *hw,
 hwaddr base)
 {
-rom_add_blob(hwsetup, hw-data, TARGET_PAGE_SIZE, base, NULL, NULL, 
NULL);
+rom_add_blob_fixed(hwsetup, hw-data, TARGET_PAGE_SIZE, base);
 }
 
 static inline void hwsetup_add_u8(HWSetup *hw, uint8_t u)
diff --git a/include/hw/loader.h b/include/hw/loader.h
index 796cbf9..e5a24ac 100644
--- a/include/hw/loader.h
+++ b/include/hw/loader.h
@@ -57,7 +57,8 @@ int rom_add_file(const char *file, const char *fw_dir,
  bool option_rom);
 void *rom_add_blob(const char *name, const void *blob, size_t len,
hwaddr addr, const char *fw_file_name,
-   FWCfgReadCallback fw_callback, void *callback_opaque);
+   FWCfgReadCallback fw_callback, void *callback_opaque,
+   bool extendable);
 int rom_add_elf_program(const char *name, void *data, size_t datasize,
 size_t romsize, hwaddr addr);
 int rom_load_all(void);
@@ -70,7 +71,7 @@ void do_info_roms(Monitor *mon, const QDict *qdict);
 #define rom_add_file_fixed(_f, _a, _i)  \
 rom_add_file(_f, NULL, _a, _i, false)
 #define rom_add_blob_fixed(_f, _b, _l, _a)  \
-rom_add_blob(_f, _b, _l, _a, NULL, NULL, NULL)
+rom_add_blob(_f, _b, _l, _a, NULL, NULL, NULL, false)
 
 #define PC_ROM_MIN_VGA 0xc
 #define PC_ROM_MIN_OPTION  0xc8000
-- 
1.8.3.1

Re: [Qemu-devel] [PATCH 2/7] tests: Add virtio device initialization

On Fri, Jul 25, 2014 at 07:01:47PM +0200, Marc Marí wrote:
   @@ -73,3 +97,11 @@ QVirtioPCIDevice
   *qvirtio_pci_device_find(QPCIBus *bus, uint16_t device_type) 
return dev;
}
   +
   +void qvirtio_pci_enable_device(QVirtioPCIDevice *d)
   +{
   +qpci_device_enable(d-pdev);
   +d-addr = qpci_iomap(d-pdev, 0);
   +g_assert(d-addr != NULL);
   +}
  
  Where is qpci_iounmap() called to clean up?
 
 Missed. Also, it is unimplemented.

It would be much harder to add in the appropriate guest_free() calls
later so users should still call it.  Just like we should call
guest_free() even if it is currently unimplemented.

Stefan


pgpw3RjuSra6Y.pgp
Description: PGP signature

Re: [Qemu-devel] [PATCH v2 1/7] bootindex: add modify_boot_device_path function

2014-07-28 Thread Gerd Hoffmann

  I think it is ok to allow only *changing* the bootindex.
  
 Yes, that's no problem.

But then yoy always  will have a old entry where you can take the suffix
from, and you don't need the suffix as parameter for the monitor
command.

cheers,
  Gerd

Re: [Qemu-devel] [RFC PATCH v2 09/49] pckbd: adding new fields to vmstate

Il 17/07/2014 13:02, Pavel Dovgalyuk ha scritto:
 This patch adds outport to VMState to allow correct saving and restoring
 the state of PC keyboard controller.
 
 Signed-off-by: Pavel Dovgalyuk pavel.dovga...@ispras.ru
 ---
  hw/input/pckbd.c |3 ++-
  1 files changed, 2 insertions(+), 1 deletions(-)
 
 diff --git a/hw/input/pckbd.c b/hw/input/pckbd.c
 index ca1cffc..19f6658 100644
 --- a/hw/input/pckbd.c
 +++ b/hw/input/pckbd.c
 @@ -371,13 +371,14 @@ static void kbd_reset(void *opaque)
  
  static const VMStateDescription vmstate_kbd = {
  .name = pckbd,
 -.version_id = 3,
 +.version_id = 4,
  .minimum_version_id = 3,
  .fields = (VMStateField[]) {
  VMSTATE_UINT8(write_cmd, KBDState),
  VMSTATE_UINT8(status, KBDState),
  VMSTATE_UINT8(mode, KBDState),
  VMSTATE_UINT8(pending, KBDState),
 +VMSTATE_UINT8_V(outport, KBDState, 4),
  VMSTATE_END_OF_LIST()
  }
  };
 
 
 

Again it would be nice to use a subsection.  You can use as the
default value

KBD_OUT_RESET | KBD_OUT_A20
 | (kbd-status  KBD_STAT_OBF ? KBD_OUT_OBF : 0)
 | (kbd-status  KBD_STAT_MOUSE_OBF ? KBD_OUT_MOUSE_OBF : 0)

If the value of outport matches this, you need not write it.  It's not
trivial, but you could do it like this:

- needed: return false if outport doesn't have the value above

- subsection post_load: set kbd-outport_present = 1

- device post_load: reconstruct outport if kbd-outport_present == 0

Paolo

Re: [Qemu-devel] [PATCH v2 1/7] bootindex: add modify_boot_device_path function

Hi,

 -Original Message-
 From: Gerd Hoffmann [mailto:kra...@redhat.com]
 Sent: Monday, July 28, 2014 5:28 PM
 Subject: Re: [PATCH v2 1/7] bootindex: add modify_boot_device_path function

   I think it is ok to allow only *changing* the bootindex.

  Yes, that's no problem.

 But then yoy always  will have a old entry where you can take the suffix
 from, and you don't need the suffix as parameter for the monitor
 command.

No, optional. 

Because the bootindex property is not a necessary property for devices.
If a device, such as IDE cdrom haven't attach the bootindex in qemu booting 
command line, the global fw_boot_order will not have its entry. So, we should
save the suffix as parameter.

Best regards,
-Gonglei

Re: [Qemu-devel] [RFC PATCH v2 10/49] rtl8139: adding new fields to vmstate

Il 17/07/2014 13:02, Pavel Dovgalyuk ha scritto:
 This patch adds virtual clock-dependent timers to VMState to allow correct
 saving and restoring the state of RTL8139 network controller.
 
 Signed-off-by: Pavel Dovgalyuk pavel.dovga...@ispras.ru
 ---
  hw/net/rtl8139.c |5 -
  1 files changed, 4 insertions(+), 1 deletions(-)
 
 diff --git a/hw/net/rtl8139.c b/hw/net/rtl8139.c
 index 90bc5ec..992caf0 100644
 --- a/hw/net/rtl8139.c
 +++ b/hw/net/rtl8139.c
 @@ -3289,7 +3289,7 @@ static void rtl8139_pre_save(void *opaque)
  
  static const VMStateDescription vmstate_rtl8139 = {
  .name = rtl8139,
 -.version_id = 4,
 +.version_id = 5,
  .minimum_version_id = 3,
  .post_load = rtl8139_post_load,
  .pre_save  = rtl8139_pre_save,
 @@ -3363,6 +3363,9 @@ static const VMStateDescription vmstate_rtl8139 = {
  VMSTATE_STRUCT(tally_counters, RTL8139State, 0,
 vmstate_tally_counters, RTL8139TallyCounters),
  
 +VMSTATE_TIMER_V(timer, RTL8139State, 5),

timer need not be migrated, because it is reinstated by rtl8139_post_load.

 +VMSTATE_INT64_V(TimerExpire, RTL8139State, 5),

This can be in a subsection, migrated only if non-zero.

Paolo

  VMSTATE_UINT32_V(cplus_enabled, RTL8139State, 4),
  VMSTATE_END_OF_LIST()
  },

Re: [Qemu-devel] [RFC PATCH v2 12/49] mc146818rtc: add missed field to vmstate

Il 17/07/2014 13:03, Pavel Dovgalyuk ha scritto:
 This patch adds irq_reinject_on_ack_count field to VMState to allow correct
 saving/loading the state of MC146818 RTC.
 
 Signed-off-by: Pavel Dovgalyuk pavel.dovga...@ispras.ru
 ---
  hw/timer/mc146818rtc.c |3 ++-
  1 files changed, 2 insertions(+), 1 deletions(-)
 
 diff --git a/hw/timer/mc146818rtc.c b/hw/timer/mc146818rtc.c
 index 9d817ca..c204abb 100644
 --- a/hw/timer/mc146818rtc.c
 +++ b/hw/timer/mc146818rtc.c
 @@ -735,7 +735,7 @@ static int rtc_post_load(void *opaque, int version_id)
  
  static const VMStateDescription vmstate_rtc = {
  .name = mc146818rtc,
 -.version_id = 3,
 +.version_id = 4,
  .minimum_version_id = 1,
  .post_load = rtc_post_load,
  .fields = (VMStateField[]) {
 @@ -752,6 +752,7 @@ static const VMStateDescription vmstate_rtc = {
  VMSTATE_INT64_V(offset, RTCState, 3),
  VMSTATE_TIMER_V(update_timer, RTCState, 3),
  VMSTATE_UINT64_V(next_alarm_time, RTCState, 3),
 +VMSTATE_UINT16_V(irq_reinject_on_ack_count, RTCState, 4),

Also can be a subsection, migrated only if nonzero.

Paolo

  VMSTATE_END_OF_LIST()
  }
  };

Re: [Qemu-devel] [RFC PATCH v2 04/49] fdc: adding vmstate for save/restore

Il 17/07/2014 13:02, Pavel Dovgalyuk ha scritto:
 VMState added by this patch preserves correct
 loading of the FDC device state.
 
 Signed-off-by: Pavel Dovgalyuk pavel.dovga...@ispras.ru
 ---
  hw/block/fdc.c |   11 +--
  1 files changed, 9 insertions(+), 2 deletions(-)
 
 diff --git a/hw/block/fdc.c b/hw/block/fdc.c
 index 490d127..132310a 100644
 --- a/hw/block/fdc.c
 +++ b/hw/block/fdc.c
 @@ -697,12 +697,17 @@ static const VMStateDescription 
 vmstate_fdrive_media_rate = {
  
  static const VMStateDescription vmstate_fdrive = {
  .name = fdrive,
 -.version_id = 1,
 +.version_id = 2,
  .minimum_version_id = 1,
  .fields = (VMStateField[]) {
  VMSTATE_UINT8(head, FDrive),
  VMSTATE_UINT8(track, FDrive),
  VMSTATE_UINT8(sect, FDrive),
 +VMSTATE_UINT8_V(last_sect, FDrive, 2),
 +VMSTATE_UINT8_V(max_track, FDrive, 2),
 +VMSTATE_UINT16_V(bps, FDrive, 2),
 +VMSTATE_UINT8_V(ro, FDrive, 2),
 +VMSTATE_UINT8_V(perpendicular, FDrive, 2),

Perpendicular can be added to a subsection, migrated only if nonzero.
The others can be reconstructed by calling fd_revalidate in
vmstate_fdrive's post_load callback.


  VMSTATE_END_OF_LIST()
  },
  .subsections = (VMStateSubsection[]) {
 @@ -736,7 +741,7 @@ static int fdc_post_load(void *opaque, int version_id)
  
  static const VMStateDescription vmstate_fdc = {
  .name = fdc,
 -.version_id = 2,
 +.version_id = 3,
  .minimum_version_id = 2,
  .pre_save = fdc_pre_save,
  .post_load = fdc_post_load,
 @@ -769,6 +774,8 @@ static const VMStateDescription vmstate_fdc = {
  VMSTATE_UINT8_EQUAL(num_floppies, FDCtrl),
  VMSTATE_STRUCT_ARRAY(drives, FDCtrl, MAX_FD, 1,
   vmstate_fdrive, FDrive),
 +VMSTATE_INT32_V(reset_sensei, FDCtrl, 3),

Subsection, only migrated if nonzero.

 +VMSTATE_TIMER_V(result_timer, FDCtrl, 3),

Subsection, only migrated if pending.

Paolo

  VMSTATE_END_OF_LIST()
  }
  };

Re: [Qemu-devel] Possible null-ptr dereference

On Mon, Jul 28, 2014 at 06:03:45AM +, Gonglei (Arei) wrote:
 Hi,
 
 Should be easy to fix though. Does the following help?
 
 (Cc'ing Stefan  Kevin)
 
 --
 xen_disk:  fix possible null-ptr dereference
 
 Signed-off-by: Gonglei arei.gong...@huawei.com
 ---
 hw/block/xen_disk.c | 1 +
 1 file changed, 1 insertion(+)

This code path can never be reached since protocol is always set to one
of 3 valid values in xen_disk.c.  Therefore, I'm not merging this for
QEMU 2.1 where we are only taking critical bug fixes now.

Still, it will help silence static checkers and make the intent clear to
readers.

Thanks, applied to my block-next tree for QEMU 2.2:
https://github.com/stefanha/qemu/commits/block-next

Stefan


pgpu0OfPHR42l.pgp
Description: PGP signature

Re: [Qemu-devel] Possible null-ptr dereference

 -Original Message-
 From: Stefan Hajnoczi [mailto:stefa...@redhat.com]
 Sent: Monday, July 28, 2014 5:49 PM
 To: Gonglei (Arei)
 Cc: mateusz.krzywi...@windowslive.com; qemu-devel@nongnu.org;
 kw...@redhat.com
 Subject: Re: [Qemu-devel] Possible null-ptr dereference

 On Mon, Jul 28, 2014 at 06:03:45AM +, Gonglei (Arei) wrote:
  Hi,

  Should be easy to fix though. Does the following help?

  (Cc'ing Stefan  Kevin)

  --
  xen_disk:  fix possible null-ptr dereference

  Signed-off-by: Gonglei arei.gong...@huawei.com
  ---
  hw/block/xen_disk.c | 1 +
  1 file changed, 1 insertion(+)

 This code path can never be reached since protocol is always set to one
 of 3 valid values in xen_disk.c.  Therefore, I'm not merging this for
 QEMU 2.1 where we are only taking critical bug fixes now.

OK.

 Still, it will help silence static checkers and make the intent clear to
 readers.

 Thanks, applied to my block-next tree for QEMU 2.2:
 https://github.com/stefanha/qemu/commits/block-next

Thanks.

Best regards,
-Gonglei

Re: [Qemu-devel] [RFC PATCH v2 10/49] rtl8139: adding new fields to vmstate

2014-07-28 Thread Pavel Dovgaluk

 From: Paolo Bonzini [mailto:paolo.bonz...@gmail.com] On Behalf Of Paolo 
 Bonzini
 Il 17/07/2014 13:02, Pavel Dovgalyuk ha scritto:
  This patch adds virtual clock-dependent timers to VMState to allow correct
  saving and restoring the state of RTL8139 network controller.
 
  Signed-off-by: Pavel Dovgalyuk pavel.dovga...@ispras.ru
  ---
   hw/net/rtl8139.c |5 -
   1 files changed, 4 insertions(+), 1 deletions(-)
 
  diff --git a/hw/net/rtl8139.c b/hw/net/rtl8139.c
  index 90bc5ec..992caf0 100644
  --- a/hw/net/rtl8139.c
  +++ b/hw/net/rtl8139.c
  @@ -3289,7 +3289,7 @@ static void rtl8139_pre_save(void *opaque)
 
   static const VMStateDescription vmstate_rtl8139 = {
   .name = rtl8139,
  -.version_id = 4,
  +.version_id = 5,
   .minimum_version_id = 3,
   .post_load = rtl8139_post_load,
   .pre_save  = rtl8139_pre_save,
  @@ -3363,6 +3363,9 @@ static const VMStateDescription vmstate_rtl8139 = {
   VMSTATE_STRUCT(tally_counters, RTL8139State, 0,
  vmstate_tally_counters, RTL8139TallyCounters),
 
  +VMSTATE_TIMER_V(timer, RTL8139State, 5),
 
 timer need not be migrated, because it is reinstated by rtl8139_post_load.
 

  That's true for normal execution.
  In replay execution mode post_load can be called before cached virtual clock
values are loaded. This may cause invalid setting of the timer and raising
an IRQ which didn't happen in record mode.
  I will update this patch to fix post_load function and avoid this 
non-deterministic behavior.

Pavel Dovgalyuk

Re: [Qemu-devel] [RFC PATCH v2 06/49] serial: fixing vmstate for save/restore

Il 17/07/2014 13:02, Pavel Dovgalyuk ha scritto:
 -.version_id = 3,
 +.version_id = 4,
  .minimum_version_id = 2,
  .pre_save = serial_pre_save,
  .post_load = serial_post_load,
  .fields = (VMStateField[]) {
  VMSTATE_UINT16_V(divider, SerialState, 2),
  VMSTATE_UINT8(rbr, SerialState),
 +VMSTATE_UINT8_V(thr, SerialState, 4),
 +VMSTATE_UINT8_V(tsr, SerialState, 4),
  VMSTATE_UINT8(ier, SerialState),
  VMSTATE_UINT8(iir, SerialState),
  VMSTATE_UINT8(lcr, SerialState),
 @@ -613,6 +627,15 @@ const VMStateDescription vmstate_serial = {
  VMSTATE_UINT8(msr, SerialState),
  VMSTATE_UINT8(scr, SerialState),
  VMSTATE_UINT8_V(fcr_vmstate, SerialState, 3),
 +VMSTATE_INT32_V(thr_ipending, SerialState, 4),

Subsection, only migrated if it doesn't match (s-iir  UART_IIR_ID) ==
UART_IIR_THRI.

 +VMSTATE_INT32_V(last_break_enable, SerialState, 4),

Can be reconstructed in the post_load callback from s-lcr.

 +VMSTATE_INT32_V(tsr_retry, SerialState, 4),

Subsection, only migrated if nonzero.  thr/tsr can be in this subsection
as well.

 +VMSTATE_STRUCT(recv_fifo, SerialState, 4, vmstate_fifo8, Fifo8),
 +VMSTATE_STRUCT(xmit_fifo, SerialState, 4, vmstate_fifo8, Fifo8),

Two subsections, only transmitted if nonempty.

 +VMSTATE_TIMER_V(fifo_timeout_timer, SerialState, 4),

Subsection, only transmitted if pending.

 +VMSTATE_INT32_V(timeout_ipending, SerialState, 4),

Subsection, transmitted only if nonzero.

 +VMSTATE_INT32_V(poll_msl, SerialState, 4),
 +VMSTATE_TIMER_V(modem_status_poll, SerialState, 4),

Both in a subsection, only migrated if poll_msl is not -1.

Paolo

  VMSTATE_END_OF_LIST()

Re: [Qemu-devel] Possible null-ptr dereference

2014-07-28 Thread Peter Maydell

On 28 July 2014 10:49, Stefan Hajnoczi stefa...@redhat.com wrote:
 On Mon, Jul 28, 2014 at 06:03:45AM +, Gonglei (Arei) wrote:
 Hi,

 Should be easy to fix though. Does the following help?

 (Cc'ing Stefan  Kevin)

 --
 xen_disk:  fix possible null-ptr dereference

 Signed-off-by: Gonglei arei.gong...@huawei.com
 ---
 hw/block/xen_disk.c | 1 +
 1 file changed, 1 insertion(+)

 This code path can never be reached since protocol is always set to one
 of 3 valid values in xen_disk.c.

Maybe g_assert_not_reached(); ?

thanks
-- PMM

Re: [Qemu-devel] [RFC PATCH v2 05/49] parallel: adding vmstate for save/restore

Il 17/07/2014 13:02, Pavel Dovgalyuk ha scritto:
 +.fields  = (VMStateField []) {
 +VMSTATE_UINT8(state.dataw, ISAParallelState),
 +VMSTATE_UINT8(state.datar, ISAParallelState),
 +VMSTATE_UINT8(state.status, ISAParallelState),
 +VMSTATE_UINT8(state.control, ISAParallelState),
 +VMSTATE_INT32(state.irq_pending, ISAParallelState),
 +VMSTATE_INT32(state.hw_driver, ISAParallelState),

Static, doesn't need migration.

 +VMSTATE_INT32(state.epp_timeout, ISAParallelState),
 +VMSTATE_INT32(state.it_shift, ISAParallelState),

Static, doesn't need migration.

 +VMSTATE_END_OF_LIST()

Re: [Qemu-devel] [PATCH v2 1/7] bootindex: add modify_boot_device_path function

2014-07-28 Thread Gerd Hoffmann

  Hi,

I think it is ok to allow only *changing* the bootindex.
   
   Yes, that's no problem.
  
  But then yoy always  will have a old entry where you can take the suffix
  from, and you don't need the suffix as parameter for the monitor
  command.
  
 No, optional. 

 Because the bootindex property is not a necessary property for devices.
 If a device, such as IDE cdrom haven't attach the bootindex in qemu booting 
 command line, the global fw_boot_order will not have its entry.

I'd suggest to simply not support this and throw an error then.

 So, we should
 save the suffix as parameter.

I think it is a bad idea to have the suffix as monitor command
parameter.  It is a implementation detail which the monitor users should
not have to worry about.

Easiest way to do this is to allow *changing* an existing bootindex
entry only, and not support *adding* new boot entries.  The user has to
set a bootindex for any device it might want to boot from in the future
then.  I think this is acceptable.

If you want support adding bootentries at runtime (and it probably makes
sense to support removing entries too in that case) the suffix handling
should be reworked.  The suffix could be stored in DeviceState instead
of being passed to the add_bootentry function, so you can add boot
entries later on without expecting the user to know what the correct
suffix is.

cheers,
  Gerd

Re: [Qemu-devel] [RFC PATCH v2 10/49] rtl8139: adding new fields to vmstate

Il 28/07/2014 11:54, Pavel Dovgaluk ha scritto:
   +VMSTATE_TIMER_V(timer, RTL8139State, 5),
  
  timer need not be migrated, because it is reinstated by rtl8139_post_load.
  
   That's true for normal execution.
   In replay execution mode post_load can be called before cached virtual clock
 values are loaded. This may cause invalid setting of the timer and raising
 an IRQ which didn't happen in record mode.
   I will update this patch to fix post_load function and avoid this 
 non-deterministic behavior.

This is what worries me of this series.  These invariants are not
documented anywhere, and people will break them unless you add
assertions that also hold in normal mode.

Paolo

Re: [Qemu-devel] [RFC PATCH v2 00/49] Series short description

Il 28/07/2014 09:50, Pavel Dovgaluk ha scritto:
 - patches 2 to 13 probably should try to use subsections, so that VMs
 that do not use the devices try not to save the extra data and keep
 backwards migration compatibility (at least try to)
 
  Could you give me and example?
  As I know, subsection is loaded when some predicate function returns true. 
 How can I 
 construct such a function for integratorcp module? What kind of condition 
 should it check?
  In this module I just added missed vmstates (it does not saved/restored at 
 all
 by the master version).

I tried to review the x86 ones.  These are the most interesting ones,
because it's the only board where we support cross-version migration.

Parallel is (quite obviously) unsalvageable, and I think HPET is too.

 - patch 16 should also use subsections, and perhaps apply to all other
 CPUs too?
 
  We implemented replay only for i386 and ARM. If we'll change other targets, 
 it will not
 add record/replay capabilities to them, but can confuse the reviewers.

Why are these fields only required by record/replay?

 - patches 23-24-25 perhaps could try using icount, like Konrad's patch do?
 
  Using faster icount (like in Konrad's patches) is the our next aim. It 
 obviously will 
 increase the speed of recording process. But now I submitted slower, but more 
 conservative
 version of icount which we had already tested.

I see.

 - patch 27 makes sense but VIRTUAL is used to skip blinking when the VM
 is stopped
 
  Right, this is kind of hack. I haven't found better solution yet.

I think it's okay to use REALTIME, just add runstate_is_running() to the
if (now = s-cursor_blink_time).

That said, every read to virtual clock is written to the
replay log worries me a bit from the point of view of thread-safety.
Do you need to log all reads because you don't use icount?  Reads can
only happen at given points if you use icount, and you could simply log
the icount value at each synchronization point.

Paolo

Re: [Qemu-devel] [PATCH v2 1/7] bootindex: add modify_boot_device_path function

 -Original Message-
 From: Gerd Hoffmann [mailto:kra...@redhat.com]
 Sent: Monday, July 28, 2014 6:02 PM
 Subject: Re: [PATCH v2 1/7] bootindex: add modify_boot_device_path function

   Hi,

 I think it is ok to allow only *changing* the bootindex.

Yes, that's no problem.

   But then yoy always  will have a old entry where you can take the suffix
   from, and you don't need the suffix as parameter for the monitor
   command.

  No, optional.

  Because the bootindex property is not a necessary property for devices.
  If a device, such as IDE cdrom haven't attach the bootindex in qemu booting
  command line, the global fw_boot_order will not have its entry.

 I'd suggest to simply not support this and throw an error then.

Ok.

  So, we should
  save the suffix as parameter.

 I think it is a bad idea to have the suffix as monitor command
 parameter.  It is a implementation detail which the monitor users should
 not have to worry about.

Yes. Actually I also have this misgivings.

 Easiest way to do this is to allow *changing* an existing bootindex
 entry only, and not support *adding* new boot entries.  The user has to
 set a bootindex for any device it might want to boot from in the future
 then.  I think this is acceptable.

Hmm..

 If you want support adding bootentries at runtime (and it probably makes
 sense to support removing entries too in that case) the suffix handling
 should be reworked.  The suffix could be stored in DeviceState instead
 of being passed to the add_bootentry function, so you can add boot
 entries later on without expecting the user to know what the correct
 suffix is.

That's a good idea! I think this is a good improvement. 

Any other comments? Thanks!

Best regards,
-Gonglei

Re: [Qemu-devel] [PATCH v2 7/7] libqos: Added basic virtqueue support to virtio implementation

2014-07-28 Thread Marc Marí

El Fri, 25 Jul 2014 23:37:43 +0200
Marc Marí marc.mari.barc...@gmail.com escribió:
 +
 +data = g_malloc0(512);
 +memread(r_req+16, data, 512);
 +g_assert_cmpstr(data, ==, TEST);
 +g_free(data);
 +

guest_free() for both requests should be added here. Will be added for
v3

 +/* End test */
 +guest_free(alloc, vq-desc);
  qvirtio_pci_disable_device(dev);
  g_free(dev);
  test_end();

Re: [Qemu-devel] [PULL for-2.1] Trivial patch for 2014-07-26

2014-07-28 Thread Peter Maydell

On 26 July 2014 08:18, Michael Tokarev m...@tls.msk.ru wrote:
 There's just one trivial patch this time, fixing another
 occurence of allows to in help text.  Please consider
 applying for 2.1.

 Thanks,

 /mjt

 The following changes since commit c60a57ff497667780132a3fcdc1500c83af5d5c0:

   Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into 
 staging (2014-07-25 16:58:41 +0100)

 are available in the git repository at:

   git://git.corpit.ru/qemu.git tags/trivial-patches-2014-07-26

 for you to fetch changes up to 2f47b403bd494b6717ef14c0d278d4b2d364b382:

   qemu-options: fix another allows-to for -net l2tpv3 (2014-07-26 11:16:44 
 +0400)

 
 trivial patches for 2014-07-26

 
 Michael Tokarev (1):
   qemu-options: fix another allows-to for -net l2tpv3

  qemu-options.hx |2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

Applied, thanks.

-- PMM

Re: [Qemu-devel] [PATCH v2 7/7] spapr: fix possible memory leak



On 25.07.14 08:52, arei.gong...@huawei.com wrote:

From: Gonglei arei.gong...@huawei.com

get_boot_devices_list() will malloc memory, spapr_finalize_fdt
doesn't free it.

Signed-off-by: Chenliang chenlian...@huawei.com
Signed-off-by: Gonglei arei.gong...@huawei.com


Thanks, applied to ppc-next-2.2.


Alex

[Qemu-devel] [PATCH] target-mips: Ignore unassigned accesses with KVM

2014-07-28 Thread James Hogan

MIPS registers an unassigned access handler which raises a guest bus
error exception. However this causes QEMU to crash when KVM is enabled
as it isn't called from the main execution loop so longjmp() gets called
without a corresponding setjmp().

Until the KVM API can be updated to trigger a guest exception in
response to an MMIO exit, prevent the bus error exception being raised
from mips_cpu_unassigned_access() if KVM is enabled.

The check is at run time since the do_unassigned_access callback is
initialised before it is known whether KVM will be enabled.

The problem can be triggered with Malta emulation by making the guest
write to the reset region at physical address 0x1bf0, since it is
marked read-only which is treated as unassigned for writes.

Signed-off-by: James Hogan james.ho...@imgtec.com
Cc: Aurelien Jarno aurel...@aurel32.net
Cc: Peter Maydell peter.mayd...@linaro.org
Cc: Paolo Bonzini pbonz...@redhat.com
Cc: Gleb Natapov g...@redhat.com
Cc: Christoffer Dall christoffer.d...@linaro.org
Cc: Sanjay Lal sanj...@kymasys.com
---
 target-mips/op_helper.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/target-mips/op_helper.c b/target-mips/op_helper.c
index 27651a4a00c1..df97b35f8701 100644
--- a/target-mips/op_helper.c
+++ b/target-mips/op_helper.c
@@ -21,6 +21,7 @@
 #include qemu/host-utils.h
 #include exec/helper-proto.h
 #include exec/cpu_ldst.h
+#include sysemu/kvm.h
 
 #ifndef CONFIG_USER_ONLY
 static inline void cpu_mips_tlb_flush (CPUMIPSState *env, int flush_global);
@@ -2168,6 +2169,16 @@ void mips_cpu_unassigned_access(CPUState *cs, hwaddr 
addr,
 MIPSCPU *cpu = MIPS_CPU(cs);
 CPUMIPSState *env = cpu-env;
 
+/*
+ * Raising an exception with KVM enabled will crash because it won't be 
from
+ * the main execution loop so the longjmp won't have a matching setjmp.
+ * Until we can trigger a bus error exception through KVM lets just ignore
+ * the access.
+ */
+if (kvm_enabled()) {
+return;
+}
+
 if (is_exec) {
 helper_raise_exception(env, EXCP_IBE);
 } else {
-- 
1.8.5.5

Re: [Qemu-devel] [PATCH v2 for-2.1 2/2] pc: hack for migration compatibility from QEMU 2.0

Il 28/07/2014 13:45, Michael S. Tsirkin ha scritto:
 +/* These are used to size the ACPI tables for -M pc-i440fx-1.7 and
 + * -M pc-i440fx-2.0.
 
 Let's just say 2.0 and earlier?

This would give the idea that 1.6 is broken, but it isn't.

  Even if the actual amount of AML generated grows
 + * a little bit, there should be plenty of free space since the DSDT
 + * shrunk by ~1.5k between QEMU 2.0 and QEMU 2.1.
 + */
 +#define ACPI_BUILD_CPU_AML_SIZE97
 +#define ACPI_BUILD_BRIDGE_AML_SIZE 1875
 
 Let's put _LEGACY_ somewhere here?

Ok.

 +
 +#define ACPI_BUILD_TABLE_SIZE  0x1
 +
  typedef struct AcpiCpuInfo {
  DECLARE_BITMAP(found_cpus, ACPI_CPU_HOTPLUG_ID_LIMIT);
  } AcpiCpuInfo;
 @@ -87,6 +99,8 @@ typedef struct AcpiBuildPciBusHotplugState {
  struct AcpiBuildPciBusHotplugState *parent;
  } AcpiBuildPciBusHotplugState;
  
 +unsigned bsel_alloc;
 +
 
 Patch will be better contained if instead of using a global
 bsel_alloc, we actually go and count the devices that
 have ACPI_PCIHP_PROP_BSEL.
 You can just scan all devices, or all pci devices, it
 should not matter.
 This way, this code will be local to the legacy path.

Ok.

 
  static void acpi_get_dsdt(AcpiMiscInfo *info)
  {
  uint16_t *applesmc_sta;
 @@ -759,8 +773,8 @@ static void *acpi_set_bsel(PCIBus *bus, void *opaque)
  static void acpi_set_pci_info(void)
  {
  PCIBus *bus = find_i440fx(); /* TODO: Q35 support */
 -unsigned bsel_alloc = 0;
  
 +assert(bsel_alloc == 0);
  if (bus) {
  /* Scan all PCI buses. Set property to enable acpi based hotplug. */
  pci_for_each_bus_depth_first(bus, acpi_set_bsel, NULL, bsel_alloc);
 @@ -1440,13 +1454,14 @@ static
  void acpi_build(PcGuestInfo *guest_info, AcpiBuildTables *tables)
  {
  GArray *table_offsets;
 -unsigned facs, dsdt, rsdt;
 +unsigned facs, ssdt, dsdt, rsdt;
  AcpiCpuInfo cpu;
  AcpiPmInfo pm;
  AcpiMiscInfo misc;
  AcpiMcfgInfo mcfg;
  PcPciInfo pci;
  uint8_t *u;
 +size_t aml_len = 0;
  
  acpi_get_cpu_info(cpu);
  acpi_get_pm_info(pm);
 @@ -1474,13 +1489,20 @@ void acpi_build(PcGuestInfo *guest_info, 
 AcpiBuildTables *tables)
  dsdt = tables-table_data-len;
  build_dsdt(tables-table_data, tables-linker, misc);
  
 +/* Count the size of the DSDT and SSDT, we will need it for legacy
 + * sizing of ACPI tables.
 + */
 +aml_len += tables-table_data-len - dsdt;
 +
  /* ACPI tables pointed to by RSDT */
  acpi_add_table(table_offsets, tables-table_data);
  build_fadt(tables-table_data, tables-linker, pm, facs, dsdt);
  
 +ssdt = tables-table_data-len;
  acpi_add_table(table_offsets, tables-table_data);
  build_ssdt(tables-table_data, tables-linker, cpu, pm, misc, pci,
 guest_info);
 +aml_len += tables-table_data-len - ssdt;
  
  acpi_add_table(table_offsets, tables-table_data);
  build_madt(tables-table_data, tables-linker, cpu, guest_info);
 @@ -1513,12 +1535,53 @@ void acpi_build(PcGuestInfo *guest_info, 
 AcpiBuildTables *tables)
  /* RSDP is in FSEG memory, so allocate it separately */
  build_rsdp(tables-rsdp, tables-linker, rsdt);
  
 -/* We'll expose it all to Guest so align size to reduce
 +/* We'll expose it all to Guest so we want to reduce
   * chance of size changes.
   * RSDP is small so it's easy to keep it immutable, no need to
   * bother with alignment.
 + *
 + * We used to align the tables to 4k, but of course this would
 + * too simple to be enough.  4k turned out to be too small an
 + * alignment very soon, and in fact it is almost impossible to
 + * keep the table size stable for all (max_cpus, max_memory_slots)
 + * combinations.  So the table size is always 64k for pc-i440fx-2.1
 + * and we give an error if the table grows beyond that limit.
 + *
 + * We still have the problem of migrating from -M pc-i440fx-2.0.  For
 + * that, we exploit the fact that QEMU 2.1 generates _smaller_ tables
 + * than 2.0 and we can always pad the smaller tables with zeros.  We can
 + * then use the exact size of the 2.0 tables.
 + *
 + * All this is for PIIX4, since QEMU 2.0 didn't support Q35 migration.
   */
 -acpi_align_size(tables-table_data, 0x1000);
 +if (guest_info-legacy_acpi_table_size) {
 +/* Subtracting aml_len gives the size of fixed tables.  Then add the
 + * size of the PIIX4 DSDT/SSDT in QEMU 2.0.
 + */
 +int legacy_aml_len =
 +guest_info-legacy_acpi_table_size +
 +ACPI_BUILD_CPU_AML_SIZE * max_cpus +
 +ACPI_BUILD_BRIDGE_AML_SIZE * (MAX(bsel_alloc, 1) - 1);
 +int legacy_table_size =
 +ROUND_UP(tables-table_data-len - aml_len + legacy_aml_len, 
 0x1000);
 +if (tables-table_data-len  legacy_table_size) {
 +/* -M pc-i440fx-2.0 doesn't support memory hotplug, so this 
 should
 + * never happen.

Re: [Qemu-devel] [PATCH v12 1/4] spapr_pci: Make find_phb()/find_dev() public



On 16.07.14 02:20, Gavin Shan wrote:

From: Alexey Kardashevskiy a...@ozlabs.ru

This makes find_phb()/find_dev() public and changed its names
to spapr_pci_find_phb()/spapr_pci_find_dev() as they are going to
be used from other parts of QEMU such as VFIO DDW (dynamic DMA window)
or VFIO PCI error injection or VFIO EEH handling - in all these
cases there are RTAS calls which are addressed to BUID+config_addr
in IEEE1275 format.

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru


Do we need this still?


Alex

Re: [Qemu-devel] [PATCH v12 3/4] headers: Update kernel header



On 16.07.14 03:40, Gavin Shan wrote:

On Wed, Jul 16, 2014 at 11:32:13AM +1000, Alexey Kardashevskiy wrote:

On 07/16/2014 11:16 AM, Gavin Shan wrote:

On Wed, Jul 16, 2014 at 11:09:44AM +1000, Alexey Kardashevskiy wrote:

On 07/16/2014 10:20 AM, Gavin Shan wrote:

This updates kernel header (vfio.h) for EEH support on VFIO PCI
devices.

Has this reached kernel upstream? The way linux headers update normally
happens is you have to run scripts/update-linux-headers.sh against some
linux kernel tag which you know that it won't change (like v3.16-rc5) and
post all the changes as a single patch. It is never a header update for a
specific feature, it is just an update.


The kernel part isn't merged yet. I guess that's for 3.17 merge window.
Ok, good to know scripts/update-linux-headers.sh. So this patch should
be dropped and some one run the script to update QEMU (linux-headers
directory) ?


Once your changes are in upstream kernel, you wait till kernel tree gets
new v3.xx-rcX tag, then you run the script and make a separate patch for
QEMU. Then you wait till it reaches QEMU upstream (because I do not know
who will pull it to what tree, look at git history) or ppc-next (if Alex
pulls it and you are basing your work on ppc-next) and then repost other
patches.


Thanks for detailed explaining, Alexey. I guess I have to suspend a bit
until v3.17.rc1 is coming out.


It's also perfectly fine to keep the kernel header update inside your 
patch set, but please make sure to mention which commit it is against. 
If I pick it up, I will generate the headers myself though and drop your 
patch while applying the rest of the set.



Alex

Re: [Qemu-devel] [PATCH] pty: Fix byte loss bug when connecting to pty

Il 28/07/2014 13:39, Sebastian Tanase ha scritto:
 When trying to print data to the pty, we first check if it is connected.
 If not, we try to reconnect, but we drop the pending data even if we
 have successfully reconnected; this makes us lose the first byte of the very
 first transmission.
 This small fix addresses the issue by checking once more if the pty is 
 connected
 after having tried to reconnect.
 
 Signed-off-by: Sebastian Tanase sebastian.tan...@openwide.fr
 ---
 
 To reproduce the bug, launch a qemu image that has a parallel port (say lp0)
 and redirect it to a pty (-parallel pty). After the VM is launched,
 open the corresponding pty on your host (cat /dev/pts/X) and send some
 data from the VM to the host: echo abcd  /dev/lp0
 The first time, the received string will be bcd instead of abcd.
 This bug can have important consequences if you try, for example,
 to send a postscript file from a printer within the VM. Losing the
 first character will render the .ps file unusable.
 ---
  qemu-char.c | 4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)
 
 diff --git a/qemu-char.c b/qemu-char.c
 index 7acc03f..ce52d0f 100644
 --- a/qemu-char.c
 +++ b/qemu-char.c
 @@ -1160,7 +1160,9 @@ static int pty_chr_write(CharDriverState *chr, const 
 uint8_t *buf, int len)
  if (!s-connected) {
  /* guest sends data, check for (re-)connect */
  pty_chr_update_read_handler_locked(chr);
 -return 0;
 +if (!s-connected) {
 +return 0;
 +}
  }
  return io_channel_send(s-fd, buf, len);
  }
 

Looks ok, though only for 2.2 and 2.1.1.  Gerd, can you take care of
this patch?

Paolo

Re: [Qemu-devel] [PULL for-2.1 0/3] Last minute patches

2014-07-28 Thread Juan Quintela

Peter Maydell peter.mayd...@linaro.org wrote:
 On 25 July 2014 15:22, Paolo Bonzini pbonz...@redhat.com wrote:
 Since Igor hasn't sent his patches, and I'm leaving the office, I pushed
 this to

git://github.com/bonzini/qemu.git tags/for-upstream-full

 I don't know about tests/acpi-test-data/pc.  It makes sense that this
 patch should modify something there, but:

 * make check passes

 * the test warns even before patch 1, for both the DSDT (modified by the
 patch) and SSDT (which this series doesn't touch at all)

 * I cannot get it to pass, except by blindly copying the actual output
 on the expected files

 * mst is on vacation and Marcel is off on Fridays

 Based on my understanding of the problem, it is not possible to fix the
 bug without hacks like this one, and even reverting all patches in this
 area would be more risky.

 Hmm. I'm not really sure what the right thing is, so what
 I'm planning to do is:
  * just apply the qemu-char fix for now
  * not tag -rc4 today
  * see if things are clearer on Monday (I see Igor has now
sent out a patchset)
  * tag -rc4 Mon or Tues
  * slip the release date a few days (not a big deal, I think)

I am reading both patch-sets.

I preffer very much Paolo solution to Igor one.

But I have to say that I don't understand PATCH 1 (neither before or
after the change).  Solution does what we should do, that is generate
the size that destination is expecting, and no simply blindy accept
packages that are smaller.

The compatibility bits of PATCH2 look ok (that ones, I can kind of
understand them).


Igor?

Later, Juan.

Re: [Qemu-devel] [Bug 1343827] [NEW] block.c: multiwrite_merge() truncates overlapping requests

2014-07-28 Thread Andrey Korolyov

On Fri, Jul 18, 2014 at 10:15 AM, Slava Pestov
sviatoslav.pes...@gmail.com wrote:
 Public bug reported:

 If the list of requests passed to multiwrite_merge() contains two
 requests where the first is for a range of sectors that is a strict
 subset of the second's, the second request is truncated to end where the
 first starts, so the second half of the second request is lost.

 This is easy to reproduce by running fio against a virtio-blk device
 running on qemu 2.1.0-rc1 with the below fio script. At least with fio
 2.0.13, the randwrite pass will issue overlapping bios to the block
 driver, which the kernel is happy to pass along to qemu:

 [global]
 randrepeat=0
 ioengine=libaio
 iodepth=64
 direct=1
 size=1M
 numjobs=1
 verify_fatal=1
 verify_dump=1

 filename=$dev

 [seqwrite]
 blocksize_range=4k-1M
 rw=write
 verify=crc32c-intel

 [randwrite]
 stonewall
 blocksize_range=4k-1M
 rw=randwrite
 verify=meta

 Here is a naive fix for the problem that simply avoids merging
 problematic requests. I guess a better solution would be to redo
 qemu_iovec_concat() to do the right thing.

 diff -ur old/qemu-2.1.0-rc2/block.c qemu-2.1.0-rc2/block.c
 --- old/qemu-2.1.0-rc2/block.c  2014-07-15 14:49:14.0 -0700
 +++ qemu-2.1.0-rc2/block.c  2014-07-17 23:03:14.224169741 -0700
 @@ -4460,7 +4460,9 @@
  int64_t oldreq_last = reqs[outidx].sector + reqs[outidx].nb_sectors;

  // Handle exactly sequential writes and overlapping writes.
 -if (reqs[i].sector = oldreq_last) {
 +// If this request ends before the previous one, don't merge.
 +if (reqs[i].sector = oldreq_last 
 +reqs[i].sector + reqs[i].nb_sectors = oldreq_last) {
  merge = 1;
  }

 ** Affects: qemu
  Importance: Undecided
  Status: New

 --
 You received this bug notification because you are a member of qemu-
 devel-ml, which is subscribed to QEMU.
 https://bugs.launchpad.net/bugs/1343827

 Title:
   block.c: multiwrite_merge() truncates overlapping requests

 Status in QEMU:
   New

 Bug description:
   If the list of requests passed to multiwrite_merge() contains two
   requests where the first is for a range of sectors that is a strict
   subset of the second's, the second request is truncated to end where
   the first starts, so the second half of the second request is lost.

   This is easy to reproduce by running fio against a virtio-blk device
   running on qemu 2.1.0-rc1 with the below fio script. At least with fio
   2.0.13, the randwrite pass will issue overlapping bios to the block
   driver, which the kernel is happy to pass along to qemu:

   [global]
   randrepeat=0
   ioengine=libaio
   iodepth=64
   direct=1
   size=1M
   numjobs=1
   verify_fatal=1
   verify_dump=1

   filename=$dev

   [seqwrite]
   blocksize_range=4k-1M
   rw=write
   verify=crc32c-intel

   [randwrite]
   stonewall
   blocksize_range=4k-1M
   rw=randwrite
   verify=meta

   Here is a naive fix for the problem that simply avoids merging
   problematic requests. I guess a better solution would be to redo
   qemu_iovec_concat() to do the right thing.

   diff -ur old/qemu-2.1.0-rc2/block.c qemu-2.1.0-rc2/block.c
   --- old/qemu-2.1.0-rc2/block.c  2014-07-15 14:49:14.0 -0700
   +++ qemu-2.1.0-rc2/block.c  2014-07-17 23:03:14.224169741 -0700
   @@ -4460,7 +4460,9 @@
int64_t oldreq_last = reqs[outidx].sector + 
 reqs[outidx].nb_sectors;

// Handle exactly sequential writes and overlapping writes.
   -if (reqs[i].sector = oldreq_last) {
   +// If this request ends before the previous one, don't merge.
   +if (reqs[i].sector = oldreq_last 
   +reqs[i].sector + reqs[i].nb_sectors = oldreq_last) {
merge = 1;
}

 To manage notifications about this bug go to:
 https://bugs.launchpad.net/qemu/+bug/1343827/+subscriptions


Hello,

bug is still here in the master and can be easily reproduced (and, of
course, looks like blocker since data corruption takes place). Does
anyone have an idea on when the fix (at least suggested one) will be
merged?

Re: [Qemu-devel] [PULL for-2.1 0/3] Last minute patches

On Mon, 28 Jul 2014 14:59:26 +0200
Juan Quintela quint...@redhat.com wrote:

 Peter Maydell peter.mayd...@linaro.org wrote:
  On 25 July 2014 15:22, Paolo Bonzini pbonz...@redhat.com wrote:
  Since Igor hasn't sent his patches, and I'm leaving the office, I pushed
  this to
 
 git://github.com/bonzini/qemu.git tags/for-upstream-full
 
  I don't know about tests/acpi-test-data/pc.  It makes sense that this
  patch should modify something there, but:
 
  * make check passes
 
  * the test warns even before patch 1, for both the DSDT (modified by the
  patch) and SSDT (which this series doesn't touch at all)
 
  * I cannot get it to pass, except by blindly copying the actual output
  on the expected files
 
  * mst is on vacation and Marcel is off on Fridays
 
  Based on my understanding of the problem, it is not possible to fix the
  bug without hacks like this one, and even reverting all patches in this
  area would be more risky.
 
  Hmm. I'm not really sure what the right thing is, so what
  I'm planning to do is:
   * just apply the qemu-char fix for now
   * not tag -rc4 today
   * see if things are clearer on Monday (I see Igor has now
 sent out a patchset)
   * tag -rc4 Mon or Tues
   * slip the release date a few days (not a big deal, I think)
 
 I am reading both patch-sets.
 
 I preffer very much Paolo solution to Igor one.
 
 But I have to say that I don't understand PATCH 1 (neither before or
 after the change).  Solution does what we should do, that is generate
 the size that destination is expecting, and no simply blindy accept
 packages that are smaller.
 
 The compatibility bits of PATCH2 look ok (that ones, I can kind of
 understand them).

 
 Igor?
These patches work for 2.0-2.1 for -M pc-1.7 (Ubuntu case), however they 
doesn't
for 1.7-2.1 (RHEL7 case). 
Also chasing exact size is a bit problematic considering that different iasl
versions produce differently sized AML output.

As far as I understood from chat on #qemu,
Michael is going to pick-up this series + extendable RAMBlock series +
disabling PCI bridge hotplug patch and may be cook additional one to make 
working
backwards migration.

 
 Later, Juan.

Re: [Qemu-devel] [PATCH 0/4 v8] ppc: Add debug stub support



On 14.07.14 11:15, Bharat Bhushan wrote:

This patchset add support for
  - software breakpoint
  - h/w breakpoint
  - h/w watchpoint
   
Please find description in individual patch.


Thanks, applied (with minor edits in comments) to ppc-next-2.2.


Alex

Re: [Qemu-devel] [PATCH 3/3] ppc/spapr: Fix MAX_CPUS to 255



On 27.06.14 08:47, Nikunj A Dadhania wrote:

MAX_CPUS 256 is inconsistent with qemu supporting upto 255 cpus. This
MAX_CPUS number was percolated back to virsh capabilities with wrong
max_cpus.

Signed-off-by: Nikunj A Dadhania nik...@linux.vnet.ibm.com


Nice one :). Thanks, applied to ppc-next-2.2.


Alex

Re: [Qemu-devel] [PATCH/RFC 0/5] s390x/kvm: track the logical cpu state in QEMU and propagate it to kvm



On 10.07.14 15:27, David Hildenbrand wrote:

This is the qemu part of kernel series Let user space control the
cpu states

Christian Borntraeger (1):
   update linux headers with with cpustate changes

David Hildenbrand (4):
   s390x/kvm: introduce proper states for s390 cpus
   s390x/kvm: proper use of the cpu states OPERATING and STOPPED
   s390x/kvm: test whether a cpu is STOPPED when checking has_work
   s390x/kvm: propagate s390 cpu state to kvm

  hw/s390x/ipl.c|   2 +-
  hw/s390x/s390-virtio.c|  32 --
  linux-headers/linux/kvm.h |   7 ++-
  target-s390x/cpu.c| 106 +++---
  target-s390x/cpu.h|  33 +--
  target-s390x/helper.c |  11 ++---
  target-s390x/kvm.c|  49 ++---
  trace-events  |   6 +++
  8 files changed, 179 insertions(+), 67 deletions(-)


Looks good to me

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


@all thought it was the final internal review :)


It's a perfectly good thing to say looks good to me in public too. The 
only major difference is that usually you would say reviewed-by ;).



Alex

Re: [Qemu-devel] [PATCH/RFC 0/5] s390x/kvm: track the logical cpu state in QEMU and propagate it to kvm


On 28.07.2014, at 15:43, Alexander Graf ag...@suse.de wrote:

 
 On 10.07.14 15:27, David Hildenbrand wrote:
 This is the qemu part of kernel series Let user space control the
 cpu states
 
 Christian Borntraeger (1):
   update linux headers with with cpustate changes
 
 David Hildenbrand (4):
   s390x/kvm: introduce proper states for s390 cpus
   s390x/kvm: proper use of the cpu states OPERATING and STOPPED
   s390x/kvm: test whether a cpu is STOPPED when checking has_work
   s390x/kvm: propagate s390 cpu state to kvm
 
  hw/s390x/ipl.c|   2 +-
  hw/s390x/s390-virtio.c|  32 --
  linux-headers/linux/kvm.h |   7 ++-
  target-s390x/cpu.c| 106 
 +++---
  target-s390x/cpu.h|  33 +--
  target-s390x/helper.c |  11 ++---
  target-s390x/kvm.c|  49 ++---
  trace-events  |   6 +++
  8 files changed, 179 insertions(+), 67 deletions(-)
 
 Looks good to me
 
 --
 To unsubscribe from this list: send the line unsubscribe kvm in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 
 @all thought it was the final internal review :)
 
 It's a perfectly good thing to say looks good to me in public too. The only 
 major difference is that usually you would say reviewed-by ;).

Meh - only realized after I sent this that all those patches are From: you. 
Then of course it's not useful ;)


Alex

Re: [Qemu-devel] [PATCH/RFC 4/5] s390x/kvm: test whether a cpu is STOPPED when checking has_work



On 10.07.14 15:10, Christian Borntraeger wrote:

From: David Hildenbrand d...@linux.vnet.ibm.com

If a cpu is stopped, it must never be allowed to run and no interrupt may wake 
it
up. A cpu also has to be unhalted if it is halted and has work to do - this
scenario wasn't hit in kvm case yet, as only disabled wait is processed within
QEMU.

Signed-off-by: David Hildenbrand d...@linux.vnet.ibm.com
Reviewed-by: Cornelia Huck cornelia.h...@de.ibm.com
Reviewed-by: Christian Borntraeger borntrae...@de.ibm.com
Signed-off-by: Christian Borntraeger borntrae...@de.ibm.com


This looks like it's something that generic infrastructure should take 
care of, no? How does this work for the other archs? They always get an 
interrupt on the transition between !has_work - has_work. Why don't we 
get one for s390x?



Alex

Re: [Qemu-devel] [PATCH v12 1/4] spapr_pci: Make find_phb()/find_dev() public

2014-07-28 Thread Alexey Kardashevskiy

On 07/28/2014 10:49 PM, Alexander Graf wrote:
 
 On 16.07.14 02:20, Gavin Shan wrote:
 From: Alexey Kardashevskiy a...@ozlabs.ru

 This makes find_phb()/find_dev() public and changed its names
 to spapr_pci_find_phb()/spapr_pci_find_dev() as they are going to
 be used from other parts of QEMU such as VFIO DDW (dynamic DMA window)
 or VFIO PCI error injection or VFIO EEH handling - in all these
 cases there are RTAS calls which are addressed to BUID+config_addr
 in IEEE1275 format.

 Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
 
 Do we need this still?


Ufff. I missed when this patchet stopped needing this patch :)
Looks to me that this patchset does not need it so I'll repost it for DDW
later.


-- 
Alexey

[Qemu-devel] [PATCH v3 0/5] ACPI fixes for QEMU 2.1

v2-v3:
fix tests/acpi-test-data/pc/DSDT [Peter]
track down make check failure, fix it [patch 4, me]
split patch 2 in two parts [mst]
do not make bsel_alloc global [mst]
include Igor's bridge patch [mst, as discussed on IRC]

Igor Mammedov (1):
  pc: acpi: generate AML only for PCI0 devices if PCI bridge hotplug is
disabled

Paolo Bonzini (4):
  acpi-dsdt: procedurally generate _PRT
  pc: hack for migration compatibility from QEMU 2.0
  pc: future-proof migration-compatibility of ACPI tables
  bios-tables-test: fix ASL normalization false positive

 hw/i386/acpi-build.c|  119 ++-
 hw/i386/acpi-dsdt.dsl   |   90 +-
 hw/i386/acpi-dsdt.hex.generated | 1910 +++
 hw/i386/pc_piix.c   |   19 +
 hw/i386/pc_q35.c|5 +
 include/hw/i386/pc.h|1 +
 tests/acpi-test-data/pc/DSDT|  Bin 4499 - 2807 bytes
 tests/bios-tables-test.c|6 +-
 8 files changed, 288 insertions(+), 1862 deletions(-)

-- 
1.8.3.1

[Qemu-devel] [PATCH 1/5] acpi-dsdt: procedurally generate _PRT

This replaces the _PRT constant with a method that computes it.

The problem is that the DSDT+SSDT have grown from 2.0 to 2.1,
enough to cross the 8k barrier (we align the ACPI tables to 4k
before putting them in fw_cfg).  This causes problems with
migration and the pc-i440fx-2.0 machine type.

The solution to the problem is to hardcode 64k as the limit,
but this doesn't solve the bug with pc-i440fx-2.0.  The fix will be
for QEMU 2.1 to use exactly the same size as QEMU 2.0 for the
ACPI tables.  First, however, we must make the actual AML
equal or smaller; to do this, rewrite _PRT in a way that saves
over 1k of bytecode.

Reviewed-by: Laszlo Ersek ler...@redhat.com
Tested-by: Igor Mammedov imamm...@redhat.com
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 hw/i386/acpi-dsdt.dsl   |   90 +-
 hw/i386/acpi-dsdt.hex.generated | 1910 +++
 tests/acpi-test-data/pc/DSDT|  Bin 4499 - 2807 bytes
 3 files changed, 148 insertions(+), 1852 deletions(-)

diff --git a/hw/i386/acpi-dsdt.dsl b/hw/i386/acpi-dsdt.dsl
index 3cc0ea0..6ba0170 100644
--- a/hw/i386/acpi-dsdt.dsl
+++ b/hw/i386/acpi-dsdt.dsl
@@ -181,57 +181,45 @@ DefinitionBlock (
 
 Scope(\_SB) {
 Scope(PCI0) {
-Name(_PRT, Package() {
-/* PCI IRQ routing table, example from ACPI 2.0a specification,
-   section 6.2.8.1 */
-/* Note: we provide the same info as the PCI routing
-   table of the Bochs BIOS */
-
-#define prt_slot(nr, lnk0, lnk1, lnk2, lnk3) \
-Package() { nr##, 0, lnk0, 0 }, \
-Package() { nr##, 1, lnk1, 0 }, \
-Package() { nr##, 2, lnk2, 0 }, \
-Package() { nr##, 3, lnk3, 0 }
-
-#define prt_slot0(nr) prt_slot(nr, LNKD, LNKA, LNKB, LNKC)
-#define prt_slot1(nr) prt_slot(nr, LNKA, LNKB, LNKC, LNKD)
-#define prt_slot2(nr) prt_slot(nr, LNKB, LNKC, LNKD, LNKA)
-#define prt_slot3(nr) prt_slot(nr, LNKC, LNKD, LNKA, LNKB)
-
-prt_slot0(0x),
-/* Device 1 is power mgmt device, and can only use irq 9 */
-prt_slot(0x0001, LNKS, LNKB, LNKC, LNKD),
-prt_slot2(0x0002),
-prt_slot3(0x0003),
-prt_slot0(0x0004),
-prt_slot1(0x0005),
-prt_slot2(0x0006),
-prt_slot3(0x0007),
-prt_slot0(0x0008),
-prt_slot1(0x0009),
-prt_slot2(0x000a),
-prt_slot3(0x000b),
-prt_slot0(0x000c),
-prt_slot1(0x000d),
-prt_slot2(0x000e),
-prt_slot3(0x000f),
-prt_slot0(0x0010),
-prt_slot1(0x0011),
-prt_slot2(0x0012),
-prt_slot3(0x0013),
-prt_slot0(0x0014),
-prt_slot1(0x0015),
-prt_slot2(0x0016),
-prt_slot3(0x0017),
-prt_slot0(0x0018),
-prt_slot1(0x0019),
-prt_slot2(0x001a),
-prt_slot3(0x001b),
-prt_slot0(0x001c),
-prt_slot1(0x001d),
-prt_slot2(0x001e),
-prt_slot3(0x001f),
-})
+Method (_PRT, 0) {
+Store(Package(128) {}, Local0)
+Store(Zero, Local1)
+While(LLess(Local1, 128)) {
+// slot = pin  2
+Store(ShiftRight(Local1, 2), Local2)
+
+// lnk = (slot + pin)  3
+Store(And(Add(Local1, Local2), 3), Local3)
+If (LEqual(Local3, 0)) {
+Store(Package(4) { Zero, Zero, LNKD, Zero }, Local4)
+}
+If (LEqual(Local3, 1)) {
+// device 1 is the power-management device, needs SCI
+If (LEqual(Local1, 4)) {
+Store(Package(4) { Zero, Zero, LNKS, Zero }, 
Local4)
+} Else {
+Store(Package(4) { Zero, Zero, LNKA, Zero }, 
Local4)
+}
+}
+If (LEqual(Local3, 2)) {
+Store(Package(4) { Zero, Zero, LNKB, Zero }, Local4)
+}
+If (LEqual(Local3, 3)) {
+Store(Package(4) { Zero, Zero, LNKC, Zero }, Local4)
+}
+
+// Complete the interrupt routing entry:
+//Package(4) { 0x[slot], [pin], [link], 0) }
+
+Store(Or(ShiftLeft(Local2, 16), 0x), Index(Local4, 0))
+Store(And(Local1, 3),Index(Local4, 1))
+Store(Local4,Index(Local0, 
Local1))
+
+Increment(Local1)
+}
+
+Return(Local0)
+}
 }

[Qemu-devel] [PATCH 2/5] pc: hack for migration compatibility from QEMU 2.0

Changing the ACPI table size causes migration to break, and the memory
hotplug work opened our eyes on how horribly we were breaking things in
2.0 already.

The ACPI table size is rounded to the next 4k, which one would think
gives some headroom.  In practice this is not the case, because the user
can control the ACPI table size (each CPU adds 97 bytes to the SSDT and
8 to the MADT) and so some -smp values will break the 4k boundary and
fail to migrate.  Similarly, PCI bridges add ~1870 bytes to the SSDT.

This patch concerns itself with fixing migration from QEMU 2.0.  It
computes the payload size of QEMU 2.0 and always uses that one.
The previous patch shrunk the ACPI tables enough that the QEMU 2.0 size
should always be enough; non-AML tables can change depending on the
configuration (especially MADT, SRAT, HPET) but they remain the same
between QEMU 2.0 and 2.1, so we only compute our padding based on the
sizes of the SSDT and DSDT.

Migration from QEMU 1.7 should work for guests that have a number of CPUs
other than 12, 13, 14, 54, 55, 56, 97, 98, 139, 140.  It was already
broken from QEMU 1.7 to QEMU 2.0 in the same way, though.

The amount of AML for a bridge varies a little bit between 1872 and 1875
due to optimized number encodings.  Use the smallest value, and let any
extra chew away at the slack left by the shrinking of the DSDT.

Reviewed-by: Laszlo Ersek ler...@redhat.com
Tested-by: Igor Mammedov imamm...@redhat.com
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 hw/i386/acpi-build.c | 84 +---
 hw/i386/pc_piix.c| 19 
 hw/i386/pc_q35.c |  5 
 include/hw/i386/pc.h |  1 +
 4 files changed, 105 insertions(+), 4 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index ebc5f03..7d2251f 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -25,7 +25,9 @@
 #include glib.h
 #include qemu-common.h
 #include qemu/bitmap.h
+#include qemu/osdep.h
 #include qemu/range.h
+#include qemu/error-report.h
 #include hw/pci/pci.h
 #include qom/cpu.h
 #include hw/i386/pc.h
@@ -52,6 +54,15 @@
 #include qapi/qmp/qint.h
 #include qom/qom-qobject.h
 
+/* These are used to size the ACPI tables for -M pc-i440fx-1.7 and
+ * -M pc-i440fx-2.0.  Even if the actual amount of AML generated grows
+ * a little bit, there should be plenty of free space since the DSDT
+ * shrunk by ~1.5k between QEMU 2.0 and QEMU 2.1.
+ */
+#define ACPI_BUILD_LEGACY_CPU_AML_SIZE97
+#define ACPI_BUILD_LEGACY_BRIDGE_AML_SIZE 1872
+#define ACPI_BUILD_ALIGN_SIZE 0x1000
+
 typedef struct AcpiCpuInfo {
 DECLARE_BITMAP(found_cpus, ACPI_CPU_HOTPLUG_ID_LIMIT);
 } AcpiCpuInfo;
@@ -737,6 +748,27 @@ static void patch_pciqxl(int slot, uint8_t *ssdt_ptr)
 ssdt_ptr[ACPI_PCIQXL_OFFSET_ADR + 2] = slot;
 }
 
+static void *pci_for_each_bus_incr_func(PCIBus *bus, void *opaque)
+{
+unsigned *bsel_alloc = opaque;
+
+if (bus-qbus.allow_hotplug) {
+(*bsel_alloc)++;
+}
+
+return bsel_alloc;
+}
+
+static unsigned acpi_count_hotpluggable_pci_buses(void)
+{
+PCIBus *bus = find_i440fx(); /* TODO: Q35 support */
+unsigned bus_count = 0;
+
+pci_for_each_bus_depth_first(bus, pci_for_each_bus_incr_func,
+ NULL, bus_count);
+return bus_count;
+}
+
 /* Assign BSEL property to all buses.  In the future, this can be changed
  * to only assign to buses that support hotplug.
  */
@@ -1440,13 +1472,14 @@ static
 void acpi_build(PcGuestInfo *guest_info, AcpiBuildTables *tables)
 {
 GArray *table_offsets;
-unsigned facs, dsdt, rsdt;
+unsigned facs, ssdt, dsdt, rsdt;
 AcpiCpuInfo cpu;
 AcpiPmInfo pm;
 AcpiMiscInfo misc;
 AcpiMcfgInfo mcfg;
 PcPciInfo pci;
 uint8_t *u;
+size_t aml_len = 0;
 
 acpi_get_cpu_info(cpu);
 acpi_get_pm_info(pm);
@@ -1474,13 +1507,20 @@ void acpi_build(PcGuestInfo *guest_info, 
AcpiBuildTables *tables)
 dsdt = tables-table_data-len;
 build_dsdt(tables-table_data, tables-linker, misc);
 
+/* Count the size of the DSDT and SSDT, we will need it for legacy
+ * sizing of ACPI tables.
+ */
+aml_len += tables-table_data-len - dsdt;
+
 /* ACPI tables pointed to by RSDT */
 acpi_add_table(table_offsets, tables-table_data);
 build_fadt(tables-table_data, tables-linker, pm, facs, dsdt);
 
+ssdt = tables-table_data-len;
 acpi_add_table(table_offsets, tables-table_data);
 build_ssdt(tables-table_data, tables-linker, cpu, pm, misc, pci,
guest_info);
+aml_len += tables-table_data-len - ssdt;
 
 acpi_add_table(table_offsets, tables-table_data);
 build_madt(tables-table_data, tables-linker, cpu, guest_info);
@@ -1513,14 +1553,50 @@ void acpi_build(PcGuestInfo *guest_info, 
AcpiBuildTables *tables)
 /* RSDP is in FSEG memory, so allocate it separately */
 build_rsdp(tables-rsdp, tables-linker, rsdt);
 
-/* We'll expose it all to Guest so align size

[Qemu-devel] [PATCH 3/5] pc: future-proof migration-compatibility of ACPI tables

This patch avoids that similar changes break QEMU again in the future.
QEMU will now hard-code 64k as the maximum ACPI table size, which
(despite being an order of magnitude smaller than 640k) should be enough
for everyone.

Reviewed-by: Laszlo Ersek ler...@redhat.com
Tested-by: Igor Mammedov imamm...@redhat.com
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 hw/i386/acpi-build.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 7d2251f..8d42eaf 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -63,6 +63,8 @@
 #define ACPI_BUILD_LEGACY_BRIDGE_AML_SIZE 1875
 #define ACPI_BUILD_ALIGN_SIZE 0x1000
 
+#define ACPI_BUILD_TABLE_SIZE 0x1
+
 typedef struct AcpiCpuInfo {
 DECLARE_BITMAP(found_cpus, ACPI_CPU_HOTPLUG_ID_LIMIT);
 } AcpiCpuInfo;
@@ -1593,7 +1595,13 @@ void acpi_build(PcGuestInfo *guest_info, AcpiBuildTables 
*tables)
 }
 g_array_set_size(tables-table_data, legacy_table_size);
 } else {
-acpi_align_size(tables-table_data, ACPI_BUILD_ALIGN_SIZE);
+if (tables-table_data-len  ACPI_BUILD_TABLE_SIZE) {
+/* As of QEMU 2.1, this fires with 160 VCPUs and 255 memory slots. 
 */
+error_report(ACPI tables are larger than 64k.  Please remove);
+error_report(CPUs, NUMA nodes, memory slots or PCI bridges.);
+exit(1);
+}
+g_array_set_size(tables-table_data, ACPI_BUILD_TABLE_SIZE);
 }
 
 acpi_align_size(tables-linker, ACPI_BUILD_ALIGN_SIZE);
-- 
1.8.3.1

[Qemu-devel] [PATCH 4/5] bios-tables-test: fix ASL normalization false positive

My version of IASL (from RHEL7) puts two newlines between the head comment
and the DefinitionBlock property.  One was already removed because the
test uses sizeof instead of strlen, but the extra one breaks the detection
of DefinitionBlock.  Killing all newlines after the comment drops the
warning.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 tests/bios-tables-test.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/tests/bios-tables-test.c b/tests/bios-tables-test.c
index 62771f7..045eb27 100644
--- a/tests/bios-tables-test.c
+++ b/tests/bios-tables-test.c
@@ -487,7 +487,11 @@ static GString *normalize_asl(gchar *asl_code)
 /* strip comments (different generation days) */
 comment = g_strstr_len(asl-str, asl-len, COMMENT_END);
 if (comment) {
-asl = g_string_erase(asl, 0, comment + sizeof(COMMENT_END) - asl-str);
+comment += strlen(COMMENT_END);
+while (*comment == '\n') {
+comment++;
+}
+asl = g_string_erase(asl, 0, comment - asl-str);
 }
 
 /* strip def block name (it has file path in it) */
-- 
1.8.3.1

[Qemu-devel] [PATCH 5/5] pc: acpi: generate AML only for PCI0 devices if PCI bridge hotplug is disabled

From: Igor Mammedov imamm...@redhat.com

Fixes migration regression from QEMU-1.7 to a newer QEMUs.
SSDT table size in QEMU-1.7 doesn't change regardless of
a number of PCI bridge devices present at startup.

However in QEMU-2.0 since addition of hotplug on PCI bridges,
each PCI bridge adds ~1875 bytes to SSDT table, including
pc-i440fx-1.7 machine type where PCI bridge hotplug disabled
via compat property.
It breaks migration from QEMU-1.7 to QEMU-2.[01] -M pc-i440fx-1.7
since RAMBlock size of ACPI tables on target becomes larger
then on source and migration fails with:

Length mismatch: /rom@etc/acpi/tables: 2000 in != 3000

error.

Fix this by generating AML only for PCI0 bus if
hotplug on PCI bridges is disabled and preserves PCI brigde
description in AML as it was done in QEMU-1.7 for pc-i440fx-1.7.

It will help to maintain size of SSDT static regardless of
number of PCI bridges on startup for pc-i440fx-1.7 machine type.

Signed-off-by: Igor Mammedov imamm...@redhat.com
[Affect bus_count too. - Paolo]
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 hw/i386/acpi-build.c | 29 +++--
 1 file changed, 23 insertions(+), 6 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 8d42eaf..90f1b90 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -77,6 +77,7 @@ typedef struct AcpiMcfgInfo {
 typedef struct AcpiPmInfo {
 bool s3_disabled;
 bool s4_disabled;
+bool pcihp_bridge_en;
 uint8_t s4_val;
 uint16_t sci_int;
 uint8_t acpi_enable_cmd;
@@ -98,6 +99,7 @@ typedef struct AcpiBuildPciBusHotplugState {
 GArray *device_table;
 GArray *notify_table;
 struct AcpiBuildPciBusHotplugState *parent;
+bool pcihp_bridge_en;
 } AcpiBuildPciBusHotplugState;
 
 static void acpi_get_dsdt(AcpiMiscInfo *info)
@@ -201,6 +203,9 @@ static void acpi_get_pm_info(AcpiPmInfo *pm)
NULL);
 pm-gpe0_blk_len = object_property_get_int(obj, ACPI_PM_PROP_GPE0_BLK_LEN,
NULL);
+pm-pcihp_bridge_en =
+object_property_get_bool(obj, acpi-pci-hotplug-with-bridge-support,
+ NULL);
 }
 
 static void acpi_get_misc_info(AcpiMiscInfo *info)
@@ -802,11 +807,13 @@ static void acpi_set_pci_info(void)
 }
 
 static void build_pci_bus_state_init(AcpiBuildPciBusHotplugState *state,
- AcpiBuildPciBusHotplugState *parent)
+ AcpiBuildPciBusHotplugState *parent,
+ bool pcihp_bridge_en)
 {
 state-parent = parent;
 state-device_table = build_alloc_array();
 state-notify_table = build_alloc_array();
+state-pcihp_bridge_en = pcihp_bridge_en;
 }
 
 static void build_pci_bus_state_cleanup(AcpiBuildPciBusHotplugState *state)
@@ -820,7 +827,7 @@ static void *build_pci_bus_begin(PCIBus *bus, void 
*parent_state)
 AcpiBuildPciBusHotplugState *parent = parent_state;
 AcpiBuildPciBusHotplugState *child = g_malloc(sizeof *child);
 
-build_pci_bus_state_init(child, parent);
+build_pci_bus_state_init(child, parent, parent-pcihp_bridge_en);
 
 return child;
 }
@@ -841,6 +848,14 @@ static void build_pci_bus_end(PCIBus *bus, void *bus_state)
 GArray *method;
 bool bus_hotplug_support = false;
 
+/*
+skip bridge subtree creation if bridge hotplug is disabled
+to make it compatible with 1.7 machine type
+*/
+if (!child-pcihp_bridge_en  bus-parent_dev) {
+return;
+}
+
 if (bus-parent_dev) {
 op = 0x82; /* DeviceOp */
 build_append_nameseg(bus_table, S%.02X_,
@@ -887,7 +902,8 @@ static void build_pci_bus_end(PCIBus *bus, void *bus_state)
 pc = PCI_DEVICE_GET_CLASS(pdev);
 dc = DEVICE_GET_CLASS(pdev);
 
-if (pc-class_id == PCI_CLASS_BRIDGE_ISA || pc-is_bridge) {
+if (pc-class_id == PCI_CLASS_BRIDGE_ISA ||
+(pc-is_bridge  child-pcihp_bridge_en)) {
 set_bit(slot, slot_device_system);
 }
 
@@ -899,7 +915,7 @@ static void build_pci_bus_end(PCIBus *bus, void *bus_state)
 }
 }
 
-if (!dc-hotpluggable || pc-is_bridge) {
+if (!dc-hotpluggable || (pc-is_bridge  child-pcihp_bridge_en)) {
 clear_bit(slot, slot_hotplug_enable);
 }
 }
@@ -1164,7 +1180,7 @@ build_ssdt(GArray *table_data, GArray *linker,
 bus = PCI_HOST_BRIDGE(pci_host)-bus;
 }
 
-build_pci_bus_state_init(hotplug_state, NULL);
+build_pci_bus_state_init(hotplug_state, NULL, 
pm-pcihp_bridge_en);
 
 if (bus) {
 /* Scan all PCI buses. Generate tables to support hotplug. */
@@ -1578,7 +1594,8 @@ void acpi_build(PcGuestInfo *guest_info, AcpiBuildTables 
*tables)
 /* Subtracting aml_len gives the size of fixed tables.  Then add the
  * size of the PIIX4 DSDT/SSDT in QEMU 2.0.

Re: [Qemu-devel] [PATCH] e1000: Delay LSC until mask is active



On 03.07.14 22:18, Michael S. Tsirkin wrote:

On Thu, Jul 03, 2014 at 08:00:04PM +0200, Alexander Graf wrote:

On 03.07.14 19:57, Peter Maydell wrote:

On 3 July 2014 18:39, Alexander Graf ag...@suse.de wrote:

Mac OS X reads ICR on every interrupt. When the IRQ line is shared, this may
result in a race where LSC is not interpreted yet, but already gets cleared.

The guest already has a way of telling us that it can interpret LSC events
though and that's via the interrupt mask register (IMS).

So if we just leave the LSC interrupt bit pending, but invisible to the guest
as long as it's not ready to receive LSC interrupts, we basically defer the
interrupt to the earliest point in time when the guest would know how to
handle it.

This would break any guests dealing with this in a polling
mode (ie permanently leave interrupts masked and read
ICR periodically to find out whether anything interesting
has happened), right?

If those guests would wait for a link detect event that way, yes.

Considering all the hackery we already have about link negotiation (delay it
until a random amount of ms passed) I'd say the breakage this patch fixes is
a lot more likely than a polling guest that waits for a link based on
ICR.LSC :).


Alex

Well that hackery was justified by the claim that real hardware behaves
this way: it has a random delay since it needs a bit of time to bring
the link up.

What's the justification here? How come this driver works with
real hardware?


Real hardware never shares interrupts? ;)


Alex

Re: [Qemu-devel] [PATCH/RFC 4/5] s390x/kvm: test whether a cpu is STOPPED when checking has_work

2014-07-28 Thread David Hildenbrand

 
 On 10.07.14 15:10, Christian Borntraeger wrote:
  From: David Hildenbrand d...@linux.vnet.ibm.com
 
  If a cpu is stopped, it must never be allowed to run and no interrupt may 
  wake it
  up. A cpu also has to be unhalted if it is halted and has work to do - this
  scenario wasn't hit in kvm case yet, as only disabled wait is processed 
  within
  QEMU.
 
  Signed-off-by: David Hildenbrand d...@linux.vnet.ibm.com
  Reviewed-by: Cornelia Huck cornelia.h...@de.ibm.com
  Reviewed-by: Christian Borntraeger borntrae...@de.ibm.com
  Signed-off-by: Christian Borntraeger borntrae...@de.ibm.com
 
 This looks like it's something that generic infrastructure should take 
 care of, no? How does this work for the other archs? They always get an 
 interrupt on the transition between !has_work - has_work. Why don't we 
 get one for s390x?
 
 
 Alex
 
 

Well, we have the special case on s390 as a CPU that is in the STOPPED or the
CHECK STOP state may never run - even if there is an interrupt. It's
basically like this CPU has been switched off.

Imagine that it is tried to inject an interrupt into a stopped vcpu. It
will kick the stopped vcpu and thus lead to a call to
kvm_arch_process_async_events(). We have to deny that this vcpu will ever
run as long as it is stopped. It's like a way to suppress the
interrupt for such a transition you mentioned.

Later, another vcpu might decide to turn that vcpu back on (by e.g. sending a
SIGP START to that vcpu).

I am not sure if such a mechanism/scenario is applicable to any other arch. They
all seem to reset the cs-halted flag if they know they are able to run (e.g.
due to an interrupt) - they have no such thing as stopped cpus, only
halted/waiting cpus.

David

Re: [Qemu-devel] [PATCH/RFC 4/5] s390x/kvm: test whether a cpu is STOPPED when checking has_work

Il 28/07/2014 16:16, David Hildenbrand ha scritto:
 Later, another vcpu might decide to turn that vcpu back on (by e.g. sending a
 SIGP START to that vcpu).
 
 I am not sure if such a mechanism/scenario is applicable to any other arch. 
 They
 all seem to reset the cs-halted flag if they know they are able to run (e.g.
 due to an interrupt) - they have no such thing as stopped cpus, only
 halted/waiting cpus.

On x86, INIT_RECEIVED is pretty much a stopped CPU.  It can only run
(and receive interrupts) after getting a special startup interrupt (SIPI).

Paolo

Re: [Qemu-devel] [PATCH/RFC 4/5] s390x/kvm: test whether a cpu is STOPPED when checking has_work

On 28.07.2014, at 16:16, David Hildenbrand d...@linux.vnet.ibm.com wrote:

On 10.07.14 15:10, Christian Borntraeger wrote:
From: David Hildenbrand d...@linux.vnet.ibm.com

If a cpu is stopped, it must never be allowed to run and no interrupt may
wake it
up. A cpu also has to be unhalted if it is halted and has work to do - this
scenario wasn't hit in kvm case yet, as only disabled wait is processed
within
QEMU.

Signed-off-by: David Hildenbrand d...@linux.vnet.ibm.com
Reviewed-by: Cornelia Huck cornelia.h...@de.ibm.com
Reviewed-by: Christian Borntraeger borntrae...@de.ibm.com
Signed-off-by: Christian Borntraeger borntrae...@de.ibm.com

This looks like it's something that generic infrastructure should take
care of, no? How does this work for the other archs? They always get an
interrupt on the transition between !has_work - has_work. Why don't we
get one for s390x?

Alex

Well, we have the special case on s390 as a CPU that is in the STOPPED or the
CHECK STOP state may never run - even if there is an interrupt. It's
basically like this CPU has been switched off.

Imagine that it is tried to inject an interrupt into a stopped vcpu. It
will kick the stopped vcpu and thus lead to a call to
kvm_arch_process_async_events(). We have to deny that this vcpu will ever
run as long as it is stopped. It's like a way to suppress the
interrupt for such a transition you mentioned.

An interrupt kick usually just means we go back into the main loop. From there
we check the interrupt bitmap which interrupt to handle. Check out the handling
code here:

http://git.qemu.org/?p=qemu.git;a=blob;f=cpu-exec.c;h=38e5f02a307523d99134f4e2e6c51683bb10b45b;hb=HEAD#l580

If you just check for the stopped state in here, do_interrupt() will never get
called and thus the CPU shouldn't ever get executed. Unless I'm heavily
mistaken :).

Later, another vcpu might decide to turn that vcpu back on (by e.g. sending a
SIGP START to that vcpu).

Yes, in that case that other CPU generates a signal (a different bit in
interrupt_request) and the first CPU would see that it has to wake up and wake
up.

I am not sure if such a mechanism/scenario is applicable to any other arch.
They
all seem to reset the cs-halted flag if they know they are able to run (e.g.
due to an interrupt) - they have no such thing as stopped cpus, only
halted/waiting cpus.

There's not really much difference between the two. The only difference from a
software point of view is that a stopped CPU has its external interrupt bits
masked off, no?

Alex

Re: [Qemu-devel] [PATCH/RFC 4/5] s390x/kvm: test whether a cpu is STOPPED when checking has_work

2014-07-28 Thread David Hildenbrand

On 28.07.2014, at 16:16, David Hildenbrand d...@linux.vnet.ibm.com wrote:

On 10.07.14 15:10, Christian Borntraeger wrote:
From: David Hildenbrand d...@linux.vnet.ibm.com

If a cpu is stopped, it must never be allowed to run and no interrupt may
wake it
up. A cpu also has to be unhalted if it is halted and has work to do -
this
scenario wasn't hit in kvm case yet, as only disabled wait is processed
within
QEMU.

Alex

Well, we have the special case on s390 as a CPU that is in the STOPPED or
the
CHECK STOP state may never run - even if there is an interrupt. It's
basically like this CPU has been switched off.

An interrupt kick usually just means we go back into the main loop. From
there we check the interrupt bitmap which interrupt to handle. Check out the
handling code here:

http://git.qemu.org/?p=qemu.git;a=blob;f=cpu-exec.c;h=38e5f02a307523d99134f4e2e6c51683bb10b45b;hb=HEAD#l580

If you just check for the stopped state in here, do_interrupt() will never
get called and thus the CPU shouldn't ever get executed. Unless I'm heavily
mistaken :).

So you would rather move the check out of has_work() into the main loop in
cpu-exec.c and directly into kvm_arch_process_async_events()?

This would on the other hand lead to an unhalt of the vcpu in cpu_exec() on any
CPU_INTERRUPT_HARD. A VCPU might thus be unhalted although it is not able to
run. Is okay?

Looking at cpu.c:cpu_thread_is_idle(), we would maybe return false, although we
are idle (because we are idle when we are stopped)?

My qemu kvm knowledge is way better than the qemu emulation knowledge, so I
appreciate any insights :)

Later, another vcpu might decide to turn that vcpu back on (by e.g. sending
a
SIGP START to that vcpu).

Yes, in that case that other CPU generates a signal (a different bit in
interrupt_request) and the first CPU would see that it has to wake up and
wake up.

I am not sure if such a mechanism/scenario is applicable to any other arch.
They
all seem to reset the cs-halted flag if they know they are able to run
(e.g.
due to an interrupt) - they have no such thing as stopped cpus, only
halted/waiting cpus.

There's not really much difference between the two. The only difference from
a software point of view is that a stopped CPU has its external interrupt
bits masked off, no?

Well the difference is, that a STOPPED vcpu can be woken up by non-interrupt
like things (SIGP START) AND a special interrupt (SIGP RESTART - which is like
a SIPI++ as it performs a psw exchange - NMI). So we basically have two
paths that can lead to a state change. All interrupt bits may be in any
combination (SIGP RESTART interrupts can't be masked out, nor can SIGP START be
denied).

The other thing may be that on s390, each vcpu (including itself) can put
another vcpu into the STOPPED state - I assume that this is different for x86
INIT_RECEIVED. For this reason we have to watch out for bad race conditions
(e.g. multiple vcpus working on another vcpu)...

David

Alex

Re: [Qemu-devel] [PATCH 2/5] pc: hack for migration compatibility from QEMU 2.0

On Mon, Jul 28, 2014 at 04:02:12PM +0200, Paolo Bonzini wrote:
 Changing the ACPI table size causes migration to break, and the memory
 hotplug work opened our eyes on how horribly we were breaking things in
 2.0 already.
 
 The ACPI table size is rounded to the next 4k, which one would think
 gives some headroom.  In practice this is not the case, because the user
 can control the ACPI table size (each CPU adds 97 bytes to the SSDT and
 8 to the MADT) and so some -smp values will break the 4k boundary and
 fail to migrate.  Similarly, PCI bridges add ~1870 bytes to the SSDT.
 
 This patch concerns itself with fixing migration from QEMU 2.0.  It
 computes the payload size of QEMU 2.0 and always uses that one.
 The previous patch shrunk the ACPI tables enough that the QEMU 2.0 size
 should always be enough; non-AML tables can change depending on the
 configuration (especially MADT, SRAT, HPET) but they remain the same
 between QEMU 2.0 and 2.1, so we only compute our padding based on the
 sizes of the SSDT and DSDT.
 
 Migration from QEMU 1.7 should work for guests that have a number of CPUs
 other than 12, 13, 14, 54, 55, 56, 97, 98, 139, 140.  It was already
 broken from QEMU 1.7 to QEMU 2.0 in the same way, though.
 
 The amount of AML for a bridge varies a little bit between 1872 and 1875
 due to optimized number encodings.  Use the smallest value, and let any
 extra chew away at the slack left by the shrinking of the DSDT.
 
 Reviewed-by: Laszlo Ersek ler...@redhat.com
 Tested-by: Igor Mammedov imamm...@redhat.com
 Signed-off-by: Paolo Bonzini pbonz...@redhat.com
 ---
  hw/i386/acpi-build.c | 84 
 +---
  hw/i386/pc_piix.c| 19 
  hw/i386/pc_q35.c |  5 
  include/hw/i386/pc.h |  1 +
  4 files changed, 105 insertions(+), 4 deletions(-)
 
 diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
 index ebc5f03..7d2251f 100644
 --- a/hw/i386/acpi-build.c
 +++ b/hw/i386/acpi-build.c
 @@ -25,7 +25,9 @@
  #include glib.h
  #include qemu-common.h
  #include qemu/bitmap.h
 +#include qemu/osdep.h
  #include qemu/range.h
 +#include qemu/error-report.h
  #include hw/pci/pci.h
  #include qom/cpu.h
  #include hw/i386/pc.h
 @@ -52,6 +54,15 @@
  #include qapi/qmp/qint.h
  #include qom/qom-qobject.h
  
 +/* These are used to size the ACPI tables for -M pc-i440fx-1.7 and
 + * -M pc-i440fx-2.0.  Even if the actual amount of AML generated grows
 + * a little bit, there should be plenty of free space since the DSDT
 + * shrunk by ~1.5k between QEMU 2.0 and QEMU 2.1.
 + */
 +#define ACPI_BUILD_LEGACY_CPU_AML_SIZE97
 +#define ACPI_BUILD_LEGACY_BRIDGE_AML_SIZE 1872


Hmm this is wrong if some slot is occupied by a
non hotpluggable device, is it not?

 +#define ACPI_BUILD_ALIGN_SIZE 0x1000
 +
  typedef struct AcpiCpuInfo {
  DECLARE_BITMAP(found_cpus, ACPI_CPU_HOTPLUG_ID_LIMIT);
  } AcpiCpuInfo;
 @@ -737,6 +748,27 @@ static void patch_pciqxl(int slot, uint8_t *ssdt_ptr)
  ssdt_ptr[ACPI_PCIQXL_OFFSET_ADR + 2] = slot;
  }
  
 +static void *pci_for_each_bus_incr_func(PCIBus *bus, void *opaque)
 +{
 +unsigned *bsel_alloc = opaque;
 +
 +if (bus-qbus.allow_hotplug) {
 +(*bsel_alloc)++;
 +}

Hmm I don't know why we do allow_hotplug in the
place where you copied this from :(

 +
 +return bsel_alloc;
 +}
 +
 +static unsigned acpi_count_hotpluggable_pci_buses(void)
 +{
 +PCIBus *bus = find_i440fx(); /* TODO: Q35 support */
 +unsigned bus_count = 0;
 +
 +pci_for_each_bus_depth_first(bus, pci_for_each_bus_incr_func,
 + NULL, bus_count);
 +return bus_count;
 +}
 +
  /* Assign BSEL property to all buses.  In the future, this can be changed
   * to only assign to buses that support hotplug.
   */
 @@ -1440,13 +1472,14 @@ static
  void acpi_build(PcGuestInfo *guest_info, AcpiBuildTables *tables)
  {
  GArray *table_offsets;
 -unsigned facs, dsdt, rsdt;
 +unsigned facs, ssdt, dsdt, rsdt;
  AcpiCpuInfo cpu;
  AcpiPmInfo pm;
  AcpiMiscInfo misc;
  AcpiMcfgInfo mcfg;
  PcPciInfo pci;
  uint8_t *u;
 +size_t aml_len = 0;
  
  acpi_get_cpu_info(cpu);
  acpi_get_pm_info(pm);
 @@ -1474,13 +1507,20 @@ void acpi_build(PcGuestInfo *guest_info, 
 AcpiBuildTables *tables)
  dsdt = tables-table_data-len;
  build_dsdt(tables-table_data, tables-linker, misc);
  
 +/* Count the size of the DSDT and SSDT, we will need it for legacy
 + * sizing of ACPI tables.
 + */
 +aml_len += tables-table_data-len - dsdt;
 +
  /* ACPI tables pointed to by RSDT */
  acpi_add_table(table_offsets, tables-table_data);
  build_fadt(tables-table_data, tables-linker, pm, facs, dsdt);
  
 +ssdt = tables-table_data-len;
  acpi_add_table(table_offsets, tables-table_data);
  build_ssdt(tables-table_data, tables-linker, cpu, pm, misc, pci,
 guest_info);
 +aml_len += tables-table_data-len - ssdt;

Re: [Qemu-devel] [PATCH 2/5] pc: hack for migration compatibility from QEMU 2.0

Il 28/07/2014 17:05, Michael S. Tsirkin ha scritto:
  +/* These are used to size the ACPI tables for -M pc-i440fx-1.7 and
  + * -M pc-i440fx-2.0.  Even if the actual amount of AML generated grows
  + * a little bit, there should be plenty of free space since the DSDT
  + * shrunk by ~1.5k between QEMU 2.0 and QEMU 2.1.
  + */
  +#define ACPI_BUILD_LEGACY_CPU_AML_SIZE97
  +#define ACPI_BUILD_LEGACY_BRIDGE_AML_SIZE 1872
 
 Hmm this is wrong if some slot is occupied by a
 non hotpluggable device, is it not?

I honestly have no idea.  If even more complicated code is needed, I'd
rather get rid of all this bridge stuff altogether (as far as computing
the AML size goes), and declare 2.0-2.1 migration broken if the VM has
bridges.

Paolo

[Qemu-devel] [PATCH 1/5] scsi-bus: prepare scsi_req_new for introduction of parse_cdb

The per-SCSIDevice parse_cdb callback must not be called if the
request will go through special SCSIReqOps, so detect the special
cases early enough.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 hw/scsi/scsi-bus.c | 51 ++-
 1 file changed, 30 insertions(+), 21 deletions(-)

diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c
index 4341754..ca4e9f3 100644
--- a/hw/scsi/scsi-bus.c
+++ b/hw/scsi/scsi-bus.c
@@ -561,13 +561,38 @@ SCSIRequest *scsi_req_new(SCSIDevice *d, uint32_t tag, 
uint32_t lun,
   uint8_t *buf, void *hba_private)
 {
 SCSIBus *bus = DO_UPCAST(SCSIBus, qbus, d-qdev.parent_bus);
+const SCSIReqOps *ops;
 SCSIRequest *req;
-SCSICommand cmd;
+SCSICommand cmd = { .len = 0 };
+int ret;
+
+if ((d-unit_attention.key == UNIT_ATTENTION ||
+ bus-unit_attention.key == UNIT_ATTENTION) 
+(buf[0] != INQUIRY 
+ buf[0] != REPORT_LUNS 
+ buf[0] != GET_CONFIGURATION 
+ buf[0] != GET_EVENT_STATUS_NOTIFICATION 
+
+ /*
+  * If we already have a pending unit attention condition,
+  * report this one before triggering another one.
+  */
+ !(buf[0] == REQUEST_SENSE  d-sense_is_ua))) {
+ops = reqops_unit_attention;
+} else if (lun != d-lun ||
+   buf[0] == REPORT_LUNS ||
+   (buf[0] == REQUEST_SENSE  d-sense_len)) {
+ops = reqops_target_command;
+} else {
+ops = NULL;
+}
 
-if (scsi_req_parse(cmd, d, buf) != 0) {
+ret = scsi_req_parse(cmd, d, buf);
+if (ret != 0) {
 trace_scsi_req_parse_bad(d-id, lun, tag, buf[0]);
 req = scsi_req_alloc(reqops_invalid_opcode, d, tag, lun, hba_private);
 } else {
+assert(cmd.len != 0);
 trace_scsi_req_parsed(d-id, lun, tag, buf[0],
   cmd.mode, cmd.xfer);
 if (cmd.lba != -1) {
@@ -577,25 +602,8 @@ SCSIRequest *scsi_req_new(SCSIDevice *d, uint32_t tag, 
uint32_t lun,
 
 if (cmd.xfer  INT32_MAX) {
 req = scsi_req_alloc(reqops_invalid_field, d, tag, lun, 
hba_private);
-} else if ((d-unit_attention.key == UNIT_ATTENTION ||
-   bus-unit_attention.key == UNIT_ATTENTION) 
-  (buf[0] != INQUIRY 
-   buf[0] != REPORT_LUNS 
-   buf[0] != GET_CONFIGURATION 
-   buf[0] != GET_EVENT_STATUS_NOTIFICATION 
-
-   /*
-* If we already have a pending unit attention condition,
-* report this one before triggering another one.
-*/
-   !(buf[0] == REQUEST_SENSE  d-sense_is_ua))) {
-req = scsi_req_alloc(reqops_unit_attention, d, tag, lun,
- hba_private);
-} else if (lun != d-lun ||
-   buf[0] == REPORT_LUNS ||
-   (buf[0] == REQUEST_SENSE  d-sense_len)) {
-req = scsi_req_alloc(reqops_target_command, d, tag, lun,
- hba_private);
+} else if (ops) {
+req = scsi_req_alloc(ops, d, tag, lun, hba_private);
 } else {
 req = scsi_device_alloc_req(d, tag, lun, buf, hba_private);
 }
@@ -1186,6 +1194,7 @@ static int scsi_req_parse(SCSICommand *cmd, SCSIDevice 
*dev, uint8_t *buf)
 {
 int rc;
 
+cmd-lba = -1;
 switch (buf[0]  5) {
 case 0:
 cmd-len = 6;
-- 
1.9.3

[Qemu-devel] [PATCH for-2.2 v2 0/5] scsi: enable passthrough of vendor-specific commands

Right now scsi-generic is parsing the CDB, in order to compute
the expected number of bytes to be transferred.  This is necessary
if DMA is done by the HBA via scsi_req_data, but it prevents executing
vendor-specific commands via scsi-generic because we don't know how
to parse them.

If DMA is delegated to the SCSI layer via get_sg_list, we know in
advance how many bytes the guest will want to receive and we can pass
the information straight from the guest to SG_IO.  In this case, it is
unnecessary to parse the CDB to get the same information.  scsi-disk needs
it to detect underruns and overruns, but scsi-generic and scsi-block can
just ask the HBA about the transfer direction and size.

This series introduces a new parse_cdb callback in both the device and
the HBA.  The latter is called by scsi_bus_parse_cdb, which devices can
call for passthrough requests in their implementation of parse_cdb.

Paolo

v1-v2: use the right CDB size for non-vendor-specific commands,
as some drivers and/or firmware expect that and complain
if you pass a READ(10) command in a 16-byte CDB.  Interdiff
here.

diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c
index f999bfa..6f4462b 100644
--- a/hw/scsi/scsi-bus.c
+++ b/hw/scsi/scsi-bus.c
@@ -57,12 +57,14 @@ int scsi_bus_parse_cdb(SCSIDevice *dev, SCSICommand *cmd, 
uint8_t *buf,
void *hba_private)
 {
 SCSIBus *bus = DO_UPCAST(SCSIBus, qbus, dev-qdev.parent_bus);
+int rc;
 
+assert(cmd-len == 0);
+rc = scsi_req_parse_cdb(dev, cmd, buf);
 if (bus-info-parse_cdb) {
-return bus-info-parse_cdb(dev, cmd, buf, hba_private);
-} else {
-return scsi_req_parse_cdb(dev, cmd, buf);
+rc = bus-info-parse_cdb(dev, cmd, buf, hba_private);
 }
+return rc;
 }
 
 static SCSIRequest *scsi_device_alloc_req(SCSIDevice *s, uint32_t tag, 
uint32_t lun,
@@ -575,7 +577,7 @@ SCSIRequest *scsi_req_new(SCSIDevice *d, uint32_t tag, 
uint32_t lun,
 const SCSIReqOps *ops;
 SCSIDeviceClass *sc = SCSI_DEVICE_GET_CLASS(d);
 SCSIRequest *req;
-SCSICommand cmd;
+SCSICommand cmd = { .len = 0 };
 int ret;
 
 if ((d-unit_attention.key == UNIT_ATTENTION ||
@@ -609,6 +611,7 @@ SCSIRequest *scsi_req_new(SCSIDevice *d, uint32_t tag, 
uint32_t lun,
 trace_scsi_req_parse_bad(d-id, lun, tag, buf[0]);
 req = scsi_req_alloc(reqops_invalid_opcode, d, tag, lun, hba_private);
 } else {
+assert(cmd.len != 0);
 trace_scsi_req_parsed(d-id, lun, tag, buf[0],
   cmd.mode, cmd.xfer);
 if (cmd.lba != -1) {
@@ -1210,6 +1213,7 @@ int scsi_req_parse_cdb(SCSIDevice *dev, SCSICommand *cmd, 
uint8_t *buf)
 {
 int rc;
 
+cmd-lba = -1;
 switch (buf[0]  5) {
 case 0:
 cmd-len = 6;
diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
index e8f4c0c..2dd9255 100644
--- a/hw/scsi/virtio-scsi.c
+++ b/hw/scsi/virtio-scsi.c
@@ -411,9 +411,10 @@ static int virtio_scsi_parse_cdb(SCSIDevice *dev, 
SCSICommand *cmd,
 {
 VirtIOSCSIReq *req = hba_private;
 
-cmd-lba = -1;
-cmd-len = MIN(VIRTIO_SCSI_CDB_SIZE, SCSI_CMD_BUF_SIZE);
-memcpy(cmd-buf, buf, cmd-len);
+if (cmd-len == 0) {
+cmd-len = MIN(VIRTIO_SCSI_CDB_SIZE, SCSI_CMD_BUF_SIZE);
+memcpy(cmd-buf, buf, cmd-len);
+}
 
 /* Extract the direction and mode directly from the request, for
  * host device passthrough.

Paolo Bonzini (5):
  scsi-bus: prepare scsi_req_new for introduction of parse_cdb
  scsi-bus: introduce parse_cdb in SCSIDeviceClass and SCSIBusInfo
  scsi-block: extract scsi_block_is_passthrough
  scsi-block, scsi-generic: implement parse_cdb
  virtio-scsi: implement parse_cdb

 hw/scsi/scsi-bus.c | 74 ++
 hw/scsi/scsi-disk.c| 52 +++
 hw/scsi/scsi-generic.c |  7 +
 hw/scsi/virtio-scsi.c  | 25 +
 include/hw/scsi/scsi.h |  7 +
 5 files changed, 130 insertions(+), 35 deletions(-)

-- 
1.9.3

[Qemu-devel] [PATCH 3/5] scsi-block: extract scsi_block_is_passthrough

This will be used for both scsi_block_new_request and the scsi-block
implementation of parse_cdb.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 hw/scsi/scsi-disk.c | 38 ++
 1 file changed, 26 insertions(+), 12 deletions(-)

diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
index d47ecd6..81b7276 100644
--- a/hw/scsi/scsi-disk.c
+++ b/hw/scsi/scsi-disk.c
@@ -2501,12 +2501,8 @@ static int scsi_block_initfn(SCSIDevice *dev)
 return scsi_initfn(s-qdev);
 }
 
-static SCSIRequest *scsi_block_new_request(SCSIDevice *d, uint32_t tag,
-   uint32_t lun, uint8_t *buf,
-   void *hba_private)
+static bool scsi_block_is_passthrough(SCSIDiskState *s, uint8_t *buf)
 {
-SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, d);
-
 switch (buf[0]) {
 case READ_6:
 case READ_10:
@@ -2523,9 +2519,9 @@ static SCSIRequest *scsi_block_new_request(SCSIDevice *d, 
uint32_t tag,
 case WRITE_VERIFY_12:
 case WRITE_VERIFY_16:
 /* If we are not using O_DIRECT, we might read stale data from the
-* host cache if writes were made using other commands than these
-* ones (such as WRITE SAME or EXTENDED COPY, etc.).  So, without
-* O_DIRECT everything must go through SG_IO.
+ * host cache if writes were made using other commands than these
+ * ones (such as WRITE SAME or EXTENDED COPY, etc.).  So, without
+ * O_DIRECT everything must go through SG_IO.
  */
 if (!(bdrv_get_flags(s-qdev.conf.bs)  BDRV_O_NOCACHE)) {
 break;
@@ -2542,13 +2538,31 @@ static SCSIRequest *scsi_block_new_request(SCSIDevice 
*d, uint32_t tag,
  * just make scsi-block operate the same as scsi-generic for them.
  */
 if (s-qdev.type != TYPE_ROM) {
-return scsi_req_alloc(scsi_disk_dma_reqops, s-qdev, tag, lun,
-  hba_private);
+return false;
 }
+break;
+
+default:
+break;
 }
 
-return scsi_req_alloc(scsi_generic_req_ops, s-qdev, tag, lun,
-  hba_private);
+return true;
+}
+
+
+static SCSIRequest *scsi_block_new_request(SCSIDevice *d, uint32_t tag,
+   uint32_t lun, uint8_t *buf,
+   void *hba_private)
+{
+SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, d);
+
+if (scsi_block_is_passthrough(s, buf)) {
+return scsi_req_alloc(scsi_generic_req_ops, s-qdev, tag, lun,
+  hba_private);
+} else {
+return scsi_req_alloc(scsi_disk_dma_reqops, s-qdev, tag, lun,
+  hba_private);
+}
 }
 #endif
 
-- 
1.9.3

[Qemu-devel] [PATCH 4/5] scsi-block, scsi-generic: implement parse_cdb

The callback lets the bus provide the direction and transfer count
for passthrough commands, enabling passthrough of vendor-specific
commands.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 hw/scsi/scsi-bus.c |  3 +--
 hw/scsi/scsi-disk.c| 14 ++
 hw/scsi/scsi-generic.c |  7 +++
 include/hw/scsi/scsi.h |  1 +
 4 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c
index d97d18a..6f4462b 100644
--- a/hw/scsi/scsi-bus.c
+++ b/hw/scsi/scsi-bus.c
@@ -9,7 +9,6 @@
 
 static char *scsibus_get_dev_path(DeviceState *dev);
 static char *scsibus_get_fw_dev_path(DeviceState *dev);
-static int scsi_req_parse_cdb(SCSIDevice *dev, SCSICommand *cmd, uint8_t *buf);
 static void scsi_req_dequeue(SCSIRequest *req);
 static uint8_t *scsi_target_alloc_buf(SCSIRequest *req, size_t len);
 static void scsi_target_free_buf(SCSIRequest *req);
@@ -1210,7 +1209,7 @@ static uint64_t scsi_cmd_lba(SCSICommand *cmd)
 return lba;
 }
 
-static int scsi_req_parse_cdb(SCSIDevice *dev, SCSICommand *cmd, uint8_t *buf)
+int scsi_req_parse_cdb(SCSIDevice *dev, SCSICommand *cmd, uint8_t *buf)
 {
 int rc;
 
diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
index 81b7276..d55521d 100644
--- a/hw/scsi/scsi-disk.c
+++ b/hw/scsi/scsi-disk.c
@@ -2564,6 +2564,19 @@ static SCSIRequest *scsi_block_new_request(SCSIDevice 
*d, uint32_t tag,
   hba_private);
 }
 }
+
+static int scsi_block_parse_cdb(SCSIDevice *d, SCSICommand *cmd,
+  uint8_t *buf, void *hba_private)
+{
+SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, d);
+
+if (scsi_block_is_passthrough(s, buf)) {
+return scsi_bus_parse_cdb(s-qdev, cmd, buf, hba_private);
+} else {
+return scsi_req_parse_cdb(s-qdev, cmd, buf);
+}
+}
+
 #endif
 
 #define DEFINE_SCSI_DISK_PROPERTIES()\
@@ -2672,6 +2685,7 @@ static void scsi_block_class_initfn(ObjectClass *klass, 
void *data)
 sc-init = scsi_block_initfn;
 sc-destroy  = scsi_destroy;
 sc-alloc_req= scsi_block_new_request;
+sc-parse_cdb= scsi_block_parse_cdb;
 dc-fw_name = disk;
 dc-desc = SCSI block device passthrough;
 dc-reset = scsi_disk_reset;
diff --git a/hw/scsi/scsi-generic.c b/hw/scsi/scsi-generic.c
index 3733d2c..0b2ff90 100644
--- a/hw/scsi/scsi-generic.c
+++ b/hw/scsi/scsi-generic.c
@@ -490,6 +490,12 @@ static Property scsi_generic_properties[] = {
 DEFINE_PROP_END_OF_LIST(),
 };
 
+static int scsi_generic_parse_cdb(SCSIDevice *dev, SCSICommand *cmd,
+  uint8_t *buf, void *hba_private)
+{
+return scsi_bus_parse_cdb(dev, cmd, buf, hba_private);
+}
+
 static void scsi_generic_class_initfn(ObjectClass *klass, void *data)
 {
 DeviceClass *dc = DEVICE_CLASS(klass);
@@ -498,6 +504,7 @@ static void scsi_generic_class_initfn(ObjectClass *klass, 
void *data)
 sc-init = scsi_generic_initfn;
 sc-destroy  = scsi_destroy;
 sc-alloc_req= scsi_new_request;
+sc-parse_cdb= scsi_generic_parse_cdb;
 dc-fw_name = disk;
 dc-desc = pass through generic scsi device (/dev/sg*);
 dc-reset = scsi_generic_reset;
diff --git a/include/hw/scsi/scsi.h b/include/hw/scsi/scsi.h
index 4a0b860..a7a28e6 100644
--- a/include/hw/scsi/scsi.h
+++ b/include/hw/scsi/scsi.h
@@ -250,6 +250,7 @@ void scsi_req_unref(SCSIRequest *req);
 
 int scsi_bus_parse_cdb(SCSIDevice *dev, SCSICommand *cmd, uint8_t *buf,
void *hba_private);
+int scsi_req_parse_cdb(SCSIDevice *dev, SCSICommand *cmd, uint8_t *buf);
 void scsi_req_build_sense(SCSIRequest *req, SCSISense sense);
 void scsi_req_print(SCSIRequest *req);
 void scsi_req_continue(SCSIRequest *req);
-- 
1.9.3

[Qemu-devel] [PATCH 2/5] scsi-bus: introduce parse_cdb in SCSIDeviceClass and SCSIBusInfo

These callbacks will let devices do their own request parsing, or
defer it to the bus.  If the bus does not provide an implementation,
in turn, fall back to the default parsing routine.

Swap the first two arguments to scsi_req_parse, and rename it to
scsi_req_parse_cdb, for consistency.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 hw/scsi/scsi-bus.c | 26 +++---
 include/hw/scsi/scsi.h |  6 ++
 2 files changed, 29 insertions(+), 3 deletions(-)

diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c
index ca4e9f3..d97d18a 100644
--- a/hw/scsi/scsi-bus.c
+++ b/hw/scsi/scsi-bus.c
@@ -9,7 +9,7 @@
 
 static char *scsibus_get_dev_path(DeviceState *dev);
 static char *scsibus_get_fw_dev_path(DeviceState *dev);
-static int scsi_req_parse(SCSICommand *cmd, SCSIDevice *dev, uint8_t *buf);
+static int scsi_req_parse_cdb(SCSIDevice *dev, SCSICommand *cmd, uint8_t *buf);
 static void scsi_req_dequeue(SCSIRequest *req);
 static uint8_t *scsi_target_alloc_buf(SCSIRequest *req, size_t len);
 static void scsi_target_free_buf(SCSIRequest *req);
@@ -54,6 +54,20 @@ static void scsi_device_destroy(SCSIDevice *s)
 }
 }
 
+int scsi_bus_parse_cdb(SCSIDevice *dev, SCSICommand *cmd, uint8_t *buf,
+   void *hba_private)
+{
+SCSIBus *bus = DO_UPCAST(SCSIBus, qbus, dev-qdev.parent_bus);
+int rc;
+
+assert(cmd-len == 0);
+rc = scsi_req_parse_cdb(dev, cmd, buf);
+if (bus-info-parse_cdb) {
+rc = bus-info-parse_cdb(dev, cmd, buf, hba_private);
+}
+return rc;
+}
+
 static SCSIRequest *scsi_device_alloc_req(SCSIDevice *s, uint32_t tag, 
uint32_t lun,
   uint8_t *buf, void *hba_private)
 {
@@ -562,6 +576,7 @@ SCSIRequest *scsi_req_new(SCSIDevice *d, uint32_t tag, 
uint32_t lun,
 {
 SCSIBus *bus = DO_UPCAST(SCSIBus, qbus, d-qdev.parent_bus);
 const SCSIReqOps *ops;
+SCSIDeviceClass *sc = SCSI_DEVICE_GET_CLASS(d);
 SCSIRequest *req;
 SCSICommand cmd = { .len = 0 };
 int ret;
@@ -587,7 +602,12 @@ SCSIRequest *scsi_req_new(SCSIDevice *d, uint32_t tag, 
uint32_t lun,
 ops = NULL;
 }
 
-ret = scsi_req_parse(cmd, d, buf);
+if (ops != NULL || !sc-parse_cdb) {
+ret = scsi_req_parse_cdb(d, cmd, buf);
+} else {
+ret = sc-parse_cdb(d, cmd, buf, hba_private);
+}
+
 if (ret != 0) {
 trace_scsi_req_parse_bad(d-id, lun, tag, buf[0]);
 req = scsi_req_alloc(reqops_invalid_opcode, d, tag, lun, hba_private);
@@ -1190,7 +1210,7 @@ static uint64_t scsi_cmd_lba(SCSICommand *cmd)
 return lba;
 }
 
-static int scsi_req_parse(SCSICommand *cmd, SCSIDevice *dev, uint8_t *buf)
+static int scsi_req_parse_cdb(SCSIDevice *dev, SCSICommand *cmd, uint8_t *buf)
 {
 int rc;
 
diff --git a/include/hw/scsi/scsi.h b/include/hw/scsi/scsi.h
index 1adb549..4a0b860 100644
--- a/include/hw/scsi/scsi.h
+++ b/include/hw/scsi/scsi.h
@@ -76,6 +76,8 @@ typedef struct SCSIDeviceClass {
 DeviceClass parent_class;
 int (*init)(SCSIDevice *dev);
 void (*destroy)(SCSIDevice *s);
+int (*parse_cdb)(SCSIDevice *dev, SCSICommand *cmd, uint8_t *buf,
+ void *hba_private);
 SCSIRequest *(*alloc_req)(SCSIDevice *s, uint32_t tag, uint32_t lun,
   uint8_t *buf, void *hba_private);
 void (*unit_attention_reported)(SCSIDevice *s);
@@ -131,6 +133,8 @@ struct SCSIReqOps {
 struct SCSIBusInfo {
 int tcq;
 int max_channel, max_target, max_lun;
+int (*parse_cdb)(SCSIDevice *dev, SCSICommand *cmd, uint8_t *buf,
+ void *hba_private);
 void (*transfer_data)(SCSIRequest *req, uint32_t arg);
 void (*complete)(SCSIRequest *req, uint32_t arg, size_t resid);
 void (*cancel)(SCSIRequest *req);
@@ -244,6 +248,8 @@ void scsi_req_free(SCSIRequest *req);
 SCSIRequest *scsi_req_ref(SCSIRequest *req);
 void scsi_req_unref(SCSIRequest *req);
 
+int scsi_bus_parse_cdb(SCSIDevice *dev, SCSICommand *cmd, uint8_t *buf,
+   void *hba_private);
 void scsi_req_build_sense(SCSIRequest *req, SCSISense sense);
 void scsi_req_print(SCSIRequest *req);
 void scsi_req_continue(SCSIRequest *req);
-- 
1.9.3

[Qemu-devel] [PATCH 5/5] virtio-scsi: implement parse_cdb

Enable passthrough of vendor-specific commands.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 hw/scsi/virtio-scsi.c | 25 +
 1 file changed, 25 insertions(+)

diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
index 0eb069a..2dd9255 100644
--- a/hw/scsi/virtio-scsi.c
+++ b/hw/scsi/virtio-scsi.c
@@ -406,6 +406,30 @@ static void virtio_scsi_command_complete(SCSIRequest *r, 
uint32_t status,
 virtio_scsi_complete_cmd_req(req);
 }
 
+static int virtio_scsi_parse_cdb(SCSIDevice *dev, SCSICommand *cmd,
+ uint8_t *buf, void *hba_private)
+{
+VirtIOSCSIReq *req = hba_private;
+
+if (cmd-len == 0) {
+cmd-len = MIN(VIRTIO_SCSI_CDB_SIZE, SCSI_CMD_BUF_SIZE);
+memcpy(cmd-buf, buf, cmd-len);
+}
+
+/* Extract the direction and mode directly from the request, for
+ * host device passthrough.
+ */
+cmd-xfer = req-qsgl.size;
+if (cmd-xfer == 0) {
+cmd-mode = SCSI_XFER_NONE;
+} else if (iov_size(req-elem.in_sg, req-elem.in_num)  req-resp_size) {
+cmd-mode = SCSI_XFER_FROM_DEV;
+} else {
+cmd-mode = SCSI_XFER_TO_DEV;
+}
+return 0;
+}
+
 static QEMUSGList *virtio_scsi_get_sg_list(SCSIRequest *r)
 {
 VirtIOSCSIReq *req = r-hba_private;
@@ -658,6 +682,7 @@ static struct SCSIBusInfo virtio_scsi_scsi_info = {
 .change = virtio_scsi_change,
 .hotplug = virtio_scsi_hotplug,
 .hot_unplug = virtio_scsi_hot_unplug,
+.parse_cdb = virtio_scsi_parse_cdb,
 .get_sg_list = virtio_scsi_get_sg_list,
 .save_request = virtio_scsi_save_request,
 .load_request = virtio_scsi_load_request,
-- 
1.9.3

Re: [Qemu-devel] [PATCH v5 2/2] vmdk: Optimize cluster allocation

On Tue, May 27, 2014 at 04:49:22PM +0800, Fam Zheng wrote:
 +if (!bs-backing_hd) {
 +memset(whole_grain, 0,  skip_start_sector  BDRV_SECTOR_BITS);
 +memset(whole_grain + (skip_end_sector  BDRV_SECTOR_BITS), 0,
 +   cluster_bytes - (skip_end_sector  BDRV_SECTOR_BITS));
 +}
 +
 +assert(skip_end_sector = sector_num + extent-cluster_sectors);

Does this assertion make sense?  skip_end_sector is a small number of
sectors (relative to start of cluster), while sector_num +
extent-cluster_sectors is a large absolute sector offset.

 +/**
 + * get_cluster_offset
 + *
 + * Look up cluster offset in extent file by sector number, and store in
 + * @cluster_offset.
 + *
 + * For flat extent, the start offset as parsed from the description file is

s/extent/extents/

 + * returned.
 + *
 + * For sparse extent, look up in L1, L2 table. If allocate is true, return an

s/extent/extents/

 + * offset for a new cluster and update L2 cache. If there is a backing file,
 + * COW is done before returning; otherwise, zeroes are written to the 
 allocated
 + * cluster. Both COW and zero writting skips the sector range

s/writting/writing/


pgpcGqEk_CN0h.pgp
Description: PGP signature

Re: [Qemu-devel] [PATCH for 2.1 V3] qemu-img info: show nocow info

On Wed, Jul 09, 2014 at 10:43:13AM +0800, Chunyan Liu wrote:
 Add nocow info in 'qemu-img info' output to show whether the file
 currently has NOCOW flag set or not.
 
 Signed-off-by: Chunyan Liu cy...@suse.com
 ---
 Changes:
   - update output info to NOCOW flag: set
 
  block/qapi.c | 25 +
  qapi/block-core.json |  5 -
  2 files changed, 29 insertions(+), 1 deletion(-)

This patch was sent on July 9th, after the 2.1 soft freeze when we stop
merging new features.  Soft freeze was 17th of June.

Please resend for QEMU 2.2 and update the qapi-schema.json version
comment.

Thanks,
Stefan


pgpnaVyvL6A24.pgp
Description: PGP signature

[Qemu-devel] [PATCH v4 0/5] ACPI fixes for QEMU 2.1

v3-v4:
drop all pretense of supporting bridges [me]

v2-v3:
fix tests/acpi-test-data/pc/DSDT [Peter]
track down make check failure, fix it [patch 4, me]
split patch 2 in two parts [mst]
do not make bsel_alloc global [mst]
include Igor's bridge patch [mst, as discussed on IRC]

Igor Mammedov (1):
  pc: acpi: generate AML only for PCI0 devices if PCI bridge hotplug is
disabled

Paolo Bonzini (4):
  acpi-dsdt: procedurally generate _PRT
  pc: hack for migration compatibility from QEMU 2.0
  pc: future-proof migration-compatibility of ACPI tables
  bios-tables-test: fix ASL normalization false positive

 hw/i386/acpi-build.c|   94 +-
 hw/i386/acpi-dsdt.dsl   |   90 +-
 hw/i386/acpi-dsdt.hex.generated | 1910 +++
 hw/i386/pc_piix.c   |   19 +
 hw/i386/pc_q35.c|5 +
 include/hw/i386/pc.h|1 +
 tests/acpi-test-data/pc/DSDT|  Bin 4499 - 2807 bytes
 tests/bios-tables-test.c|6 +-
 8 files changed, 263 insertions(+), 1862 deletions(-)

-- 
1.8.3.1

[Qemu-devel] [PATCH 4/5] bios-tables-test: fix ASL normalization false positive

My version of IASL (from RHEL7) puts two newlines between the head comment
and the DefinitionBlock property.  Kill all newlines after the comment,
so that normalize_asl works properly.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 tests/bios-tables-test.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/tests/bios-tables-test.c b/tests/bios-tables-test.c
index 62771f7..045eb27 100644
--- a/tests/bios-tables-test.c
+++ b/tests/bios-tables-test.c
@@ -487,7 +487,11 @@ static GString *normalize_asl(gchar *asl_code)
 /* strip comments (different generation days) */
 comment = g_strstr_len(asl-str, asl-len, COMMENT_END);
 if (comment) {
-asl = g_string_erase(asl, 0, comment + sizeof(COMMENT_END) - asl-str);
+comment += strlen(COMMENT_END);
+while (*comment == '\n') {
+comment++;
+}
+asl = g_string_erase(asl, 0, comment - asl-str);
 }
 
 /* strip def block name (it has file path in it) */
-- 
1.8.3.1

[Qemu-devel] [PATCH 1/5] acpi-dsdt: procedurally generate _PRT

This replaces the _PRT constant with a method that computes it.

The problem is that the DSDT+SSDT have grown from 2.0 to 2.1,
enough to cross the 8k barrier (we align the ACPI tables to 4k
before putting them in fw_cfg).  This causes problems with
migration and the pc-i440fx-2.0 machine type.

The solution to the problem is to hardcode 64k as the limit,
but this doesn't solve the bug with pc-i440fx-2.0.  The fix will be
for QEMU 2.1 to use exactly the same size as QEMU 2.0 for the
ACPI tables.  First, however, we must make the actual AML
equal or smaller; to do this, rewrite _PRT in a way that saves
over 1k of bytecode.

Reviewed-by: Laszlo Ersek ler...@redhat.com
Tested-by: Igor Mammedov imamm...@redhat.com
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 hw/i386/acpi-dsdt.dsl   |   90 +-
 hw/i386/acpi-dsdt.hex.generated | 1910 +++
 tests/acpi-test-data/pc/DSDT|  Bin 4499 - 2807 bytes
 3 files changed, 148 insertions(+), 1852 deletions(-)

diff --git a/hw/i386/acpi-dsdt.dsl b/hw/i386/acpi-dsdt.dsl
index 3cc0ea0..6ba0170 100644
--- a/hw/i386/acpi-dsdt.dsl
+++ b/hw/i386/acpi-dsdt.dsl
@@ -181,57 +181,45 @@ DefinitionBlock (
 
 Scope(\_SB) {
 Scope(PCI0) {
-Name(_PRT, Package() {
-/* PCI IRQ routing table, example from ACPI 2.0a specification,
-   section 6.2.8.1 */
-/* Note: we provide the same info as the PCI routing
-   table of the Bochs BIOS */
-
-#define prt_slot(nr, lnk0, lnk1, lnk2, lnk3) \
-Package() { nr##, 0, lnk0, 0 }, \
-Package() { nr##, 1, lnk1, 0 }, \
-Package() { nr##, 2, lnk2, 0 }, \
-Package() { nr##, 3, lnk3, 0 }
-
-#define prt_slot0(nr) prt_slot(nr, LNKD, LNKA, LNKB, LNKC)
-#define prt_slot1(nr) prt_slot(nr, LNKA, LNKB, LNKC, LNKD)
-#define prt_slot2(nr) prt_slot(nr, LNKB, LNKC, LNKD, LNKA)
-#define prt_slot3(nr) prt_slot(nr, LNKC, LNKD, LNKA, LNKB)
-
-prt_slot0(0x),
-/* Device 1 is power mgmt device, and can only use irq 9 */
-prt_slot(0x0001, LNKS, LNKB, LNKC, LNKD),
-prt_slot2(0x0002),
-prt_slot3(0x0003),
-prt_slot0(0x0004),
-prt_slot1(0x0005),
-prt_slot2(0x0006),
-prt_slot3(0x0007),
-prt_slot0(0x0008),
-prt_slot1(0x0009),
-prt_slot2(0x000a),
-prt_slot3(0x000b),
-prt_slot0(0x000c),
-prt_slot1(0x000d),
-prt_slot2(0x000e),
-prt_slot3(0x000f),
-prt_slot0(0x0010),
-prt_slot1(0x0011),
-prt_slot2(0x0012),
-prt_slot3(0x0013),
-prt_slot0(0x0014),
-prt_slot1(0x0015),
-prt_slot2(0x0016),
-prt_slot3(0x0017),
-prt_slot0(0x0018),
-prt_slot1(0x0019),
-prt_slot2(0x001a),
-prt_slot3(0x001b),
-prt_slot0(0x001c),
-prt_slot1(0x001d),
-prt_slot2(0x001e),
-prt_slot3(0x001f),
-})
+Method (_PRT, 0) {
+Store(Package(128) {}, Local0)
+Store(Zero, Local1)
+While(LLess(Local1, 128)) {
+// slot = pin  2
+Store(ShiftRight(Local1, 2), Local2)
+
+// lnk = (slot + pin)  3
+Store(And(Add(Local1, Local2), 3), Local3)
+If (LEqual(Local3, 0)) {
+Store(Package(4) { Zero, Zero, LNKD, Zero }, Local4)
+}
+If (LEqual(Local3, 1)) {
+// device 1 is the power-management device, needs SCI
+If (LEqual(Local1, 4)) {
+Store(Package(4) { Zero, Zero, LNKS, Zero }, 
Local4)
+} Else {
+Store(Package(4) { Zero, Zero, LNKA, Zero }, 
Local4)
+}
+}
+If (LEqual(Local3, 2)) {
+Store(Package(4) { Zero, Zero, LNKB, Zero }, Local4)
+}
+If (LEqual(Local3, 3)) {
+Store(Package(4) { Zero, Zero, LNKC, Zero }, Local4)
+}
+
+// Complete the interrupt routing entry:
+//Package(4) { 0x[slot], [pin], [link], 0) }
+
+Store(Or(ShiftLeft(Local2, 16), 0x), Index(Local4, 0))
+Store(And(Local1, 3),Index(Local4, 1))
+Store(Local4,Index(Local0, 
Local1))
+
+Increment(Local1)
+}
+
+Return(Local0)
+}
 }

[Qemu-devel] [PATCH 3/5] pc: future-proof migration-compatibility of ACPI tables

This patch avoids that similar changes break QEMU again in the future.
QEMU will now hard-code 64k as the maximum ACPI table size, which
(despite being an order of magnitude smaller than 640k) should be enough
for everyone.

Reviewed-by: Laszlo Ersek ler...@redhat.com
Tested-by: Igor Mammedov imamm...@redhat.com
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 hw/i386/acpi-build.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index a3d5822..25cf297 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -62,6 +62,8 @@
 #define ACPI_BUILD_LEGACY_CPU_AML_SIZE97
 #define ACPI_BUILD_ALIGN_SIZE 0x1000
 
+#define ACPI_BUILD_TABLE_SIZE 0x1
+
 typedef struct AcpiCpuInfo {
 DECLARE_BITMAP(found_cpus, ACPI_CPU_HOTPLUG_ID_LIMIT);
 } AcpiCpuInfo;
@@ -1569,7 +1571,13 @@ void acpi_build(PcGuestInfo *guest_info, AcpiBuildTables 
*tables)
 }
 g_array_set_size(tables-table_data, legacy_table_size);
 } else {
-acpi_align_size(tables-table_data, ACPI_BUILD_ALIGN_SIZE);
+if (tables-table_data-len  ACPI_BUILD_TABLE_SIZE) {
+/* As of QEMU 2.1, this fires with 160 VCPUs and 255 memory slots. 
 */
+error_report(ACPI tables are larger than 64k.  Please remove);
+error_report(CPUs, NUMA nodes, memory slots or PCI bridges.);
+exit(1);
+}
+g_array_set_size(tables-table_data, ACPI_BUILD_TABLE_SIZE);
 }
 
 acpi_align_size(tables-linker, ACPI_BUILD_ALIGN_SIZE);
-- 
1.8.3.1

[Qemu-devel] [PATCH 5/5] pc: acpi: generate AML only for PCI0 devices if PCI bridge hotplug is disabled

From: Igor Mammedov imamm...@redhat.com

Fixes migration regression from QEMU-1.7 to a newer QEMUs.
SSDT table size in QEMU-1.7 doesn't change regardless of
a number of PCI bridge devices present at startup.

However in QEMU-2.0 since addition of hotplug on PCI bridges,
each PCI bridge adds ~1875 bytes to SSDT table, including
pc-i440fx-1.7 machine type where PCI bridge hotplug disabled
via compat property.
It breaks migration from QEMU-1.7 to QEMU-2.[01] -M pc-i440fx-1.7
since RAMBlock size of ACPI tables on target becomes larger
then on source and migration fails with:

Length mismatch: /rom@etc/acpi/tables: 2000 in != 3000

error.

Fix this by generating AML only for PCI0 bus if
hotplug on PCI bridges is disabled and preserves PCI brigde
description in AML as it was done in QEMU-1.7 for pc-i440fx-1.7.

It will help to maintain size of SSDT static regardless of
number of PCI bridges on startup for pc-i440fx-1.7 machine type.

Signed-off-by: Igor Mammedov imamm...@redhat.com
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 hw/i386/acpi-build.c | 26 +-
 1 file changed, 21 insertions(+), 5 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 25cf297..b92c531 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -76,6 +76,7 @@ typedef struct AcpiMcfgInfo {
 typedef struct AcpiPmInfo {
 bool s3_disabled;
 bool s4_disabled;
+bool pcihp_bridge_en;
 uint8_t s4_val;
 uint16_t sci_int;
 uint8_t acpi_enable_cmd;
@@ -97,6 +98,7 @@ typedef struct AcpiBuildPciBusHotplugState {
 GArray *device_table;
 GArray *notify_table;
 struct AcpiBuildPciBusHotplugState *parent;
+bool pcihp_bridge_en;
 } AcpiBuildPciBusHotplugState;
 
 static void acpi_get_dsdt(AcpiMiscInfo *info)
@@ -200,6 +202,9 @@ static void acpi_get_pm_info(AcpiPmInfo *pm)
NULL);
 pm-gpe0_blk_len = object_property_get_int(obj, ACPI_PM_PROP_GPE0_BLK_LEN,
NULL);
+pm-pcihp_bridge_en =
+object_property_get_bool(obj, acpi-pci-hotplug-with-bridge-support,
+ NULL);
 }
 
 static void acpi_get_misc_info(AcpiMiscInfo *info)
@@ -780,11 +785,13 @@ static void acpi_set_pci_info(void)
 }
 
 static void build_pci_bus_state_init(AcpiBuildPciBusHotplugState *state,
- AcpiBuildPciBusHotplugState *parent)
+ AcpiBuildPciBusHotplugState *parent,
+ bool pcihp_bridge_en)
 {
 state-parent = parent;
 state-device_table = build_alloc_array();
 state-notify_table = build_alloc_array();
+state-pcihp_bridge_en = pcihp_bridge_en;
 }
 
 static void build_pci_bus_state_cleanup(AcpiBuildPciBusHotplugState *state)
@@ -798,7 +805,7 @@ static void *build_pci_bus_begin(PCIBus *bus, void 
*parent_state)
 AcpiBuildPciBusHotplugState *parent = parent_state;
 AcpiBuildPciBusHotplugState *child = g_malloc(sizeof *child);
 
-build_pci_bus_state_init(child, parent);
+build_pci_bus_state_init(child, parent, parent-pcihp_bridge_en);
 
 return child;
 }
@@ -819,6 +826,14 @@ static void build_pci_bus_end(PCIBus *bus, void *bus_state)
 GArray *method;
 bool bus_hotplug_support = false;
 
+/*
+skip bridge subtree creation if bridge hotplug is disabled
+to make it compatible with 1.7 machine type
+*/
+if (!child-pcihp_bridge_en  bus-parent_dev) {
+return;
+}
+
 if (bus-parent_dev) {
 op = 0x82; /* DeviceOp */
 build_append_nameseg(bus_table, S%.02X_,
@@ -865,7 +880,8 @@ static void build_pci_bus_end(PCIBus *bus, void *bus_state)
 pc = PCI_DEVICE_GET_CLASS(pdev);
 dc = DEVICE_GET_CLASS(pdev);
 
-if (pc-class_id == PCI_CLASS_BRIDGE_ISA || pc-is_bridge) {
+if (pc-class_id == PCI_CLASS_BRIDGE_ISA ||
+(pc-is_bridge  child-pcihp_bridge_en)) {
 set_bit(slot, slot_device_system);
 }
 
@@ -877,7 +893,7 @@ static void build_pci_bus_end(PCIBus *bus, void *bus_state)
 }
 }
 
-if (!dc-hotpluggable || pc-is_bridge) {
+if (!dc-hotpluggable || (pc-is_bridge  child-pcihp_bridge_en)) {
 clear_bit(slot, slot_hotplug_enable);
 }
 }
@@ -1142,7 +1158,7 @@ build_ssdt(GArray *table_data, GArray *linker,
 bus = PCI_HOST_BRIDGE(pci_host)-bus;
 }
 
-build_pci_bus_state_init(hotplug_state, NULL);
+build_pci_bus_state_init(hotplug_state, NULL, 
pm-pcihp_bridge_en);
 
 if (bus) {
 /* Scan all PCI buses. Generate tables to support hotplug. */
-- 
1.8.3.1

[Qemu-devel] [PATCH 2/5] pc: hack for migration compatibility from QEMU 2.0

Changing the ACPI table size causes migration to break, and the memory
hotplug work opened our eyes on how horribly we were breaking things in
2.0 already.

The ACPI table size is rounded to the next 4k, which one would think
gives some headroom.  In practice this is not the case, because the user
can control the ACPI table size (each CPU adds 97 bytes to the SSDT and
8 to the MADT) and so some -smp values will break the 4k boundary and
fail to migrate.  Similarly, PCI bridges add ~1870 bytes to the SSDT.

This patch concerns itself with fixing migration from QEMU 2.0.  It
computes the payload size of QEMU 2.0 and always uses that one.
The previous patch shrunk the ACPI tables enough that the QEMU 2.0 size
should always be enough; non-AML tables can change depending on the
configuration (especially MADT, SRAT, HPET) but they remain the same
between QEMU 2.0 and 2.1, so we only compute our padding based on the
sizes of the SSDT and DSDT.

Migration from QEMU 1.7 should work for guests that have a number of CPUs
other than 12, 13, 14, 54, 55, 56, 97, 98, 139, 140.  It was already
broken from QEMU 1.7 to QEMU 2.0 in the same way, though.

Even with this patch, QEMU 1.7 and 2.0 have two different ideas of
-M pc-i440fx-2.0 when there are PCI bridges.  Igor sent a patch to
adopt the QEMU 1.7 definition.  I think distributions should apply
it if they move directly from QEMU 1.7 to 2.1+ without ever packaging
version 2.0.

Reviewed-by: Laszlo Ersek ler...@redhat.com
Tested-by: Igor Mammedov imamm...@redhat.com
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 hw/i386/acpi-build.c | 60 
 hw/i386/pc_piix.c| 19 +
 hw/i386/pc_q35.c |  5 +
 include/hw/i386/pc.h |  1 +
 4 files changed, 81 insertions(+), 4 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index ebc5f03..a3d5822 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -25,7 +25,9 @@
 #include glib.h
 #include qemu-common.h
 #include qemu/bitmap.h
+#include qemu/osdep.h
 #include qemu/range.h
+#include qemu/error-report.h
 #include hw/pci/pci.h
 #include qom/cpu.h
 #include hw/i386/pc.h
@@ -52,6 +54,14 @@
 #include qapi/qmp/qint.h
 #include qom/qom-qobject.h
 
+/* These are used to size the ACPI tables for -M pc-i440fx-1.7 and
+ * -M pc-i440fx-2.0.  Even if the actual amount of AML generated grows
+ * a little bit, there should be plenty of free space since the DSDT
+ * shrunk by ~1.5k between QEMU 2.0 and QEMU 2.1.
+ */
+#define ACPI_BUILD_LEGACY_CPU_AML_SIZE97
+#define ACPI_BUILD_ALIGN_SIZE 0x1000
+
 typedef struct AcpiCpuInfo {
 DECLARE_BITMAP(found_cpus, ACPI_CPU_HOTPLUG_ID_LIMIT);
 } AcpiCpuInfo;
@@ -1440,13 +1450,14 @@ static
 void acpi_build(PcGuestInfo *guest_info, AcpiBuildTables *tables)
 {
 GArray *table_offsets;
-unsigned facs, dsdt, rsdt;
+unsigned facs, ssdt, dsdt, rsdt;
 AcpiCpuInfo cpu;
 AcpiPmInfo pm;
 AcpiMiscInfo misc;
 AcpiMcfgInfo mcfg;
 PcPciInfo pci;
 uint8_t *u;
+size_t aml_len = 0;
 
 acpi_get_cpu_info(cpu);
 acpi_get_pm_info(pm);
@@ -1474,13 +1485,20 @@ void acpi_build(PcGuestInfo *guest_info, 
AcpiBuildTables *tables)
 dsdt = tables-table_data-len;
 build_dsdt(tables-table_data, tables-linker, misc);
 
+/* Count the size of the DSDT and SSDT, we will need it for legacy
+ * sizing of ACPI tables.
+ */
+aml_len += tables-table_data-len - dsdt;
+
 /* ACPI tables pointed to by RSDT */
 acpi_add_table(table_offsets, tables-table_data);
 build_fadt(tables-table_data, tables-linker, pm, facs, dsdt);
 
+ssdt = tables-table_data-len;
 acpi_add_table(table_offsets, tables-table_data);
 build_ssdt(tables-table_data, tables-linker, cpu, pm, misc, pci,
guest_info);
+aml_len += tables-table_data-len - ssdt;
 
 acpi_add_table(table_offsets, tables-table_data);
 build_madt(tables-table_data, tables-linker, cpu, guest_info);
@@ -1513,14 +1531,45 @@ void acpi_build(PcGuestInfo *guest_info, 
AcpiBuildTables *tables)
 /* RSDP is in FSEG memory, so allocate it separately */
 build_rsdp(tables-rsdp, tables-linker, rsdt);
 
-/* We'll expose it all to Guest so align size to reduce
+/* We'll expose it all to Guest so we want to reduce
  * chance of size changes.
  * RSDP is small so it's easy to keep it immutable, no need to
  * bother with alignment.
+ *
+ * We used to align the tables to 4k, but of course this would
+ * too simple to be enough.  4k turned out to be too small an
+ * alignment very soon, and in fact it is almost impossible to
+ * keep the table size stable for all (max_cpus, max_memory_slots)
+ * combinations.  So the table size is always 64k for pc-i440fx-2.1
+ * and we give an error if the table grows beyond that limit.
+ *
+ * We still have the problem of migrating from -M pc-i440fx-2.0.  For
+ * that, we exploit

[Qemu-devel] [PATCH 1/3] loader: add support for resizeable blobs

Support resizeable blobs: we allocate more memory than currently
available in the blob, which can later be filled in.

Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 include/hw/loader.h   | 14 +++--
 include/hw/nvram/fw_cfg.h |  2 +-
 hw/core/loader.c  | 15 +
 hw/nvram/fw_cfg.c | 79 ---
 4 files changed, 96 insertions(+), 14 deletions(-)

diff --git a/include/hw/loader.h b/include/hw/loader.h
index 796cbf9..ecce654 100644
--- a/include/hw/loader.h
+++ b/include/hw/loader.h
@@ -55,9 +55,19 @@ extern bool rom_file_has_mr;
 int rom_add_file(const char *file, const char *fw_dir,
  hwaddr addr, int32_t bootindex,
  bool option_rom);
-void *rom_add_blob(const char *name, const void *blob, size_t len,
+void *rom_add_blob_resizeable(const char *name, const void *blob,
+  size_t len, size_t max_len,
+  hwaddr addr, const char *fw_file_name,
+  FWCfgReadCallback fw_callback,
+  void *callback_opaque);
+static inline void *rom_add_blob(const char *name, const void *blob, size_t 
len,
hwaddr addr, const char *fw_file_name,
-   FWCfgReadCallback fw_callback, void *callback_opaque);
+   FWCfgReadCallback fw_callback, void *callback_opaque)
+{
+return rom_add_blob_resizeable(name, blob, len, len, addr,
+   fw_file_name, fw_callback, callback_opaque);
+}
+
 int rom_add_elf_program(const char *name, void *data, size_t datasize,
 size_t romsize, hwaddr addr);
 int rom_load_all(void);
diff --git a/include/hw/nvram/fw_cfg.h b/include/hw/nvram/fw_cfg.h
index 72b1549..da4f5c5 100644
--- a/include/hw/nvram/fw_cfg.h
+++ b/include/hw/nvram/fw_cfg.h
@@ -75,7 +75,7 @@ void fw_cfg_add_file(FWCfgState *s, const char *filename, 
void *data,
  size_t len);
 void fw_cfg_add_file_callback(FWCfgState *s, const char *filename,
   FWCfgReadCallback callback, void 
*callback_opaque,
-  void *data, size_t len);
+  void *data, size_t len, size_t max_len);
 FWCfgState *fw_cfg_init(uint32_t ctl_port, uint32_t data_port,
 hwaddr crl_addr, hwaddr data_addr);
 
diff --git a/hw/core/loader.c b/hw/core/loader.c
index 2bf6b8f..ad6ec67 100644
--- a/hw/core/loader.c
+++ b/hw/core/loader.c
@@ -720,9 +720,11 @@ err:
 return -1;
 }
 
-void *rom_add_blob(const char *name, const void *blob, size_t len,
-   hwaddr addr, const char *fw_file_name,
-   FWCfgReadCallback fw_callback, void *callback_opaque)
+void *rom_add_blob_resizeable(const char *name, const void *blob,
+  size_t len, size_t max_len,
+  hwaddr addr, const char *fw_file_name,
+  FWCfgReadCallback fw_callback,
+  void *callback_opaque)
 {
 Rom *rom;
 void *data = NULL;
@@ -730,9 +732,10 @@ void *rom_add_blob(const char *name, const void *blob, 
size_t len,
 rom   = g_malloc0(sizeof(*rom));
 rom-name = g_strdup(name);
 rom-addr = addr;
-rom-romsize  = len;
-rom-datasize = len;
+rom-romsize  = max_len;
+rom-datasize = max_len;
 rom-data = g_malloc0(rom-datasize);
+assert(len = rom-datasize);
 memcpy(rom-data, blob, len);
 rom_insert(rom);
 if (fw_file_name  fw_cfg) {
@@ -748,7 +751,7 @@ void *rom_add_blob(const char *name, const void *blob, 
size_t len,
 
 fw_cfg_add_file_callback(fw_cfg, fw_file_name,
  fw_callback, callback_opaque,
- data, rom-romsize);
+ data, rom-romsize, rom-datasize);
 }
 return data;
 }
diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
index b71d251..65f233e 100644
--- a/hw/nvram/fw_cfg.c
+++ b/hw/nvram/fw_cfg.c
@@ -38,6 +38,8 @@
 #define FW_CFG(obj) OBJECT_CHECK(FWCfgState, (obj), TYPE_FW_CFG)
 
 typedef struct FWCfgEntry {
+uint32_t max_len;
+uint32_t reset_len;
 uint32_t len;
 uint8_t *data;
 void *callback_opaque;
@@ -57,6 +59,9 @@ struct FWCfgState {
 uint16_t cur_entry;
 uint32_t cur_offset;
 Notifier machine_ready;
+/* Entry length: used for migration */
+#define FW_CFG_LEN_ENTRIES (2 * FW_CFG_MAX_ENTRY)
+uint32_t len[FW_CFG_LEN_ENTRIES];
 };
 
 #define JPG_FILE 0
@@ -336,6 +341,13 @@ static const MemoryRegionOps fw_cfg_comb_mem_ops = {
 static void fw_cfg_reset(DeviceState *d)
 {
 FWCfgState *s = FW_CFG(d);
+int i, j;
+
+for (i = 0; i  ARRAY_SIZE(s-entries); ++i) {
+for (j = 0; j  ARRAY_SIZE(s-entries[0]); ++j) {
+s-entries[i][j].len = s-entries[i][j].reset_len;
+}
+}
 
 fw_cfg_select(s, 0);

[Qemu-devel] [PATCH 2/3] migration: load smaller RAMBlock to a bigger one if permitted

From: Igor Mammedov imamm...@redhat.com

Add API to mark memory region as extend-able on migration,
to allow migration code to load smaller RAMBlock into
a bigger one on destination QEMU instance.

This will allow to fix broken migration from QEMU 1.7/2.0 to
QEMU 2.1 due to  ACPI tables size changes across 1.7/2.0/2.1
versions by marking ACPI tables ROM blob as extend-able.
So that smaller tables from previous version could be always
migrated to a bigger rom blob on new version.

Credits-for-idea: Michael S. Tsirkin m...@redhat.com
Signed-off-by: Igor Mammedov imamm...@redhat.com
Reviewed-by: Michael S. Tsirkin m...@redhat.com
Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 include/exec/memory.h   | 11 +++
 include/exec/ram_addr.h |  3 +++
 arch_init.c | 22 +-
 exec.c  |  8 
 memory.c|  5 +
 5 files changed, 44 insertions(+), 5 deletions(-)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index e2c8e3e..f96ddbb 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -894,6 +894,17 @@ bool memory_region_present(MemoryRegion *container, hwaddr 
addr);
 bool memory_region_is_mapped(MemoryRegion *mr);
 
 /**
+ * memory_region_permit_extendable_migration: marks #MemoryRegion
+ * as extendable on migration, allowing the migration code to load
+ * source memory block of smaller size than destination memory block
+ * at migration time
+ *
+ * @mr: a #MemoryRegion whose #RAMBlock should be marked as
+ *  extendable on migration
+ */
+void memory_region_permit_extendable_migration(MemoryRegion *mr);
+
+/**
  * memory_region_find: translate an address/size relative to a
  * MemoryRegion into a #MemoryRegionSection.
  *
diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
index 6593be1..7a6b782 100644
--- a/include/exec/ram_addr.h
+++ b/include/exec/ram_addr.h
@@ -34,6 +34,9 @@ void *qemu_get_ram_ptr(ram_addr_t addr);
 void qemu_ram_free(ram_addr_t addr);
 void qemu_ram_free_from_ptr(ram_addr_t addr);
 
+#define RAM_EXTENDABLE_ON_MIGRATE (1U  31)
+void qemu_ram_set_extendable_on_migration(ram_addr_t addr);
+
 static inline bool cpu_physical_memory_get_dirty(ram_addr_t start,
  ram_addr_t length,
  unsigned client)
diff --git a/arch_init.c b/arch_init.c
index 8ddaf35..2c0c238 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -1071,11 +1071,23 @@ static int ram_load(QEMUFile *f, void *opaque, int 
version_id)
 
 QTAILQ_FOREACH(block, ram_list.blocks, next) {
 if (!strncmp(id, block-idstr, sizeof(id))) {
-if (block-length != length) {
-error_report(Length mismatch: %s:  RAM_ADDR_FMT
-  in !=  RAM_ADDR_FMT, id, length,
- block-length);
-ret =  -EINVAL;
+if (block-flags  RAM_EXTENDABLE_ON_MIGRATE) {
+if (block-length  length) {
+error_report(Length too big: %s:  
RAM_ADDR_FMT
+  in   RAM_ADDR_FMT, id, length,
+ block-length);
+ret =  -EINVAL;
+} else {
+memset(block-host, 0, block-length);
+}
+} else {
+if (block-length != length) {
+error_report(Length mismatch: %s: 
+ RAM_ADDR_FMT  in != 
+ RAM_ADDR_FMT,
+ id, length, block-length);
+ret =  -EINVAL;
+}
 }
 break;
 }
diff --git a/exec.c b/exec.c
index 765bd94..02536f8e 100644
--- a/exec.c
+++ b/exec.c
@@ -1214,6 +1214,14 @@ void qemu_ram_unset_idstr(ram_addr_t addr)
 }
 }
 
+void qemu_ram_set_extendable_on_migration(ram_addr_t addr)
+{
+RAMBlock *block = find_ram_block(addr);
+
+assert(block != NULL);
+block-flags |= RAM_EXTENDABLE_ON_MIGRATE;
+}
+
 static int memory_try_enable_merging(void *addr, size_t len)
 {
 if (!qemu_opt_get_bool(qemu_get_machine_opts(), mem-merge, true)) {
diff --git a/memory.c b/memory.c
index 64d7176..744c746 100644
--- a/memory.c
+++ b/memory.c
@@ -1791,6 +1791,11 @@ bool memory_region_is_mapped(MemoryRegion *mr)
 return mr-container ? true : false;
 }
 
+void memory_region_permit_extendable_migration(MemoryRegion *mr)
+{
+qemu_ram_set_extendable_on_migration(mr-ram_addr);
+}
+
 MemoryRegionSection memory_region_find(MemoryRegion *mr,
hwaddr addr,

[Qemu-devel] [PATCH 3/3] loader: mark MR for resizeable blobs as extendable

This makes migration from older QEMU versions
more robust.

Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 hw/core/loader.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/hw/core/loader.c b/hw/core/loader.c
index ad6ec67..fc00a87 100644
--- a/hw/core/loader.c
+++ b/hw/core/loader.c
@@ -745,6 +745,13 @@ void *rom_add_blob_resizeable(const char *name, const void 
*blob,
 
 if (rom_file_has_mr) {
 data = rom_set_mr(rom, OBJECT(fw_cfg), devpath);
+/* If there's padding at tail, blob is resizeable.
+ * Set flag to allow migration from older QEMU versions
+ * where this region could have been smaller.
+ */
+if (max_len != len) {
+memory_region_permit_extendable_migration(rom-mr);
+}
 } else {
 data = rom-data;
 }
-- 
MST

Re: [Qemu-devel] [PATCH 3/5] pc: future-proof migration-compatibility of ACPI tables

On Mon, Jul 28, 2014 at 05:34:16PM +0200, Paolo Bonzini wrote:
 This patch avoids that similar changes break QEMU again in the future.
 QEMU will now hard-code 64k as the maximum ACPI table size, which
 (despite being an order of magnitude smaller than 640k) should be enough
 for everyone.

Famous last words :) So what worries me here, is that we
are potentially breaking legal configurations for the
benefit of the minority that cares about cross-version
migration.

So I'm inclined to apply everything except this patch, and
instead, use the patches that I sent to make the
ram block very large, something like 1 Megabyte.

This localizes the pain to cross-version migration.


 
 Reviewed-by: Laszlo Ersek ler...@redhat.com
 Tested-by: Igor Mammedov imamm...@redhat.com
 Signed-off-by: Paolo Bonzini pbonz...@redhat.com
 ---
  hw/i386/acpi-build.c | 10 +-
  1 file changed, 9 insertions(+), 1 deletion(-)
 
 diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
 index a3d5822..25cf297 100644
 --- a/hw/i386/acpi-build.c
 +++ b/hw/i386/acpi-build.c
 @@ -62,6 +62,8 @@
  #define ACPI_BUILD_LEGACY_CPU_AML_SIZE97
  #define ACPI_BUILD_ALIGN_SIZE 0x1000
  
 +#define ACPI_BUILD_TABLE_SIZE 0x1
 +
  typedef struct AcpiCpuInfo {
  DECLARE_BITMAP(found_cpus, ACPI_CPU_HOTPLUG_ID_LIMIT);
  } AcpiCpuInfo;
 @@ -1569,7 +1571,13 @@ void acpi_build(PcGuestInfo *guest_info, 
 AcpiBuildTables *tables)
  }
  g_array_set_size(tables-table_data, legacy_table_size);
  } else {
 -acpi_align_size(tables-table_data, ACPI_BUILD_ALIGN_SIZE);
 +if (tables-table_data-len  ACPI_BUILD_TABLE_SIZE) {
 +/* As of QEMU 2.1, this fires with 160 VCPUs and 255 memory 
 slots.  */
 +error_report(ACPI tables are larger than 64k.  Please remove);
 +error_report(CPUs, NUMA nodes, memory slots or PCI bridges.);
 +exit(1);
 +}
 +g_array_set_size(tables-table_data, ACPI_BUILD_TABLE_SIZE);
  }
  
  acpi_align_size(tables-linker, ACPI_BUILD_ALIGN_SIZE);
 -- 
 1.8.3.1

Re: [Qemu-devel] nic is lost, in the virtual machine running

On Sat, Jul 19, 2014 at 03:31:00PM +0800, melolinux wrote:
 1. /usr/bin/qemu-system-x86_64 -machine accel=kvm -name 
 a58970a2-10aa-4973-9261-7384b2f53221 -S -machine pc-0.14,accel=kvm,usb=off -m 
 4096 -smp 4,sockets=1,cores=4,threads=1 -uuid 
 f3a2217f-ae5c-461b-a922-d37a3a98edc6 -no-user-config -nodefaults -chardev 
 socket,id=charmonitor,path=/var/lib/libvirt/qemu/a58970a2-10aa-4973-9261-7384b2f53221.monitor,server,nowait
  -mon chardev=charmonitor,id=monitor,mode=control -rtc base=localtime 
 -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device 
 virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -drive 
 file=/dev/493bdd93-6eb4-4499-a344-99cd0abef251/95eb250a-05ab-4da1-8673-8a616b4d0a6e,if=none,id=drive-ide0-0-0,format=qcow2,cache=writeback
  -device ide-hd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 
 -drive 
 file=/dev/493bdd93-6eb4-4499-a344-99cd0abef251/27d2df8a-dc00-4f6a-8669-f5dffc7a92d8,if=none,id=drive-ide0-0-1,format=qcow2,cache=writeback
  -device ide-hd,bus=ide.0,unit=1,drive=drive-ide0-0-1,id=ide0-0-1 -netdev 
 tap,ifname=tapa589700,script=no,id=hostnet0 -device 
 e1000,netdev=hostnet0,id=net0,mac=c6:b0:5c:de:54:39,bus=pci.0,addr=0x3 
 -chardev spicevmc,id=charchannel0,name=vdagent -device 
 virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0
  -device usb-tablet,id=input0 -spice 
 port=5900,addr=10.9.40.213,disable-ticketing,seamless-migration=on -vga qxl 
 -global qxl-vga.ram_size=67108864 -global qxl-vga.vram_size=9437184 -device 
 intel-hda,id=sound0,bus=pci.0,addr=0x4 -device 
 hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -device 
 virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
 
 
 2. Problem found :the client can't find the network device (It takes a long 
 time since we create the vm)
 
 3. we use qemu-monitor-commond --hmp xx info pci to check it and find that 
 we lost the ethernet controller.

There is not enough information in #2 and #3 to say what is going on.

What does lspci report inside the guest?

Does rmmod and modprobe of the NIC driver inside the guest bring the
interface back?

Did you check that QEMU's info pci reports the NIC when the guest has
booted but doesn't report it when the network becomes unavailable?  If
yes, then please check the libvirt log for hot unplug events
(/var/log/libvirt).

Stefan


pgpGud3TZbN1S.pgp
Description: PGP signature

Re: [Qemu-devel] [PATCH 3/5] pc: future-proof migration-compatibility of ACPI tables

Il 28/07/2014 17:59, Michael S. Tsirkin ha scritto:
 On Mon, Jul 28, 2014 at 05:34:16PM +0200, Paolo Bonzini wrote:
 This patch avoids that similar changes break QEMU again in the future.
 QEMU will now hard-code 64k as the maximum ACPI table size, which
 (despite being an order of magnitude smaller than 640k) should be enough
 for everyone.
 
 Famous last words :) So what worries me here, is that we
 are potentially breaking legal configurations for the
 benefit of the minority that cares about cross-version
 migration.
 
 So I'm inclined to apply everything except this patch, and
 instead, use the patches that I sent to make the
 ram block very large, something like 1 Megabyte.

Even just 128k are enough for 160 VCPUs, 255 memory slots and 35-40 PCI
bridges.  And for 2.2 I'd rather move to the other model where all
user-defined elements (MADT, SSDT) are in a separate file and we
guarantee that *all* changes are versioned by machine type.

What do you think about just changing 64k-128k?  Your patch is a huge
amount of code for -rc4.

Paolo

Re: [Qemu-devel] [Bug 1347387] [NEW] while i was created the new virtual machine using qemu the following error was shown in fedora version 20

On Wed, Jul 23, 2014 at 04:43:27AM -, selvakumar wrote:
 Public bug reported:
 
 [root@localhost pkgs]# qemu-img create virtualdisk.img 100M
 qemu-img: symbol lookup error: qemu-img: undefined symbol: glfs_discard_async

CCing Fedora qemu-kvm package maintainer.

Stefan

 [root@localhost pkgs]# qemu-i386 create virtualdisk.img 100M
 Error while loading create: No such file or directory
 
 [root@localhost pkgs]# rpm -qa qemu-kvm libvirt
 qemu-kvm-1.6.2-6.fc20.x86_64
 libvirt-1.1.3.5-2.fc20.x86_64
 [root@localhost pkgs]# 
 
 [root@localhost pkgs]# rpm -qa|grep *qemu*
 qemu-system-m68k-1.6.2-6.fc20.x86_64
 qemu-kvm-1.6.2-6.fc20.x86_64
 qemu-system-microblaze-1.6.2-6.fc20.x86_64
 ipxe-roms-qemu-20130517-3.gitc4bce43.fc20.noarch
 qemu-common-1.6.2-6.fc20.x86_64
 qemu-system-sh4-1.6.2-6.fc20.x86_64
 qemu-system-sparc-1.6.2-6.fc20.x86_64
 qemu-system-lm32-1.6.2-6.fc20.x86_64
 qemu-img-1.6.2-6.fc20.x86_64
 qemu-system-s390x-1.6.2-6.fc20.x86_64
 qemu-system-cris-1.6.2-6.fc20.x86_64
 qemu-1.6.2-6.fc20.x86_64
 qemu-system-xtensa-1.6.2-6.fc20.x86_64
 qemu-system-moxie-1.6.2-6.fc20.x86_64
 qemu-system-ppc-1.6.2-6.fc20.x86_64
 libvirt-daemon-driver-qemu-1.1.3.5-2.fc20.x86_64
 qemu-system-mips-1.6.2-6.fc20.x86_64
 qemu-system-alpha-1.6.2-6.fc20.x86_64
 qemu-guest-agent-1.6.1-2.fc20.x86_64
 qemu-user-1.6.2-6.fc20.x86_64
 qemu-system-x86-1.6.2-6.fc20.x86_64
 qemu-system-arm-1.6.2-6.fc20.x86_64
 qemu-system-unicore32-1.6.2-6.fc20.x86_64
 qemu-system-or32-1.6.2-6.fc20.x86_64
 [root@localhost pkgs]#
 
 ** Affects: qemu
  Importance: Undecided
  Status: New
 
 -- 
 You received this bug notification because you are a member of qemu-
 devel-ml, which is subscribed to QEMU.
 https://bugs.launchpad.net/bugs/1347387
 
 Title:
   while i was created the new virtual machine using qemu the following
   error was shown in fedora version 20
 
 Status in QEMU:
   New
 
 Bug description:
   [root@localhost pkgs]# qemu-img create virtualdisk.img 100M
   qemu-img: symbol lookup error: qemu-img: undefined symbol: 
 glfs_discard_async
   [root@localhost pkgs]# qemu-i386 create virtualdisk.img 100M
   Error while loading create: No such file or directory
 
   [root@localhost pkgs]# rpm -qa qemu-kvm libvirt
   qemu-kvm-1.6.2-6.fc20.x86_64
   libvirt-1.1.3.5-2.fc20.x86_64
   [root@localhost pkgs]# 
 
   [root@localhost pkgs]# rpm -qa|grep *qemu*
   qemu-system-m68k-1.6.2-6.fc20.x86_64
   qemu-kvm-1.6.2-6.fc20.x86_64
   qemu-system-microblaze-1.6.2-6.fc20.x86_64
   ipxe-roms-qemu-20130517-3.gitc4bce43.fc20.noarch
   qemu-common-1.6.2-6.fc20.x86_64
   qemu-system-sh4-1.6.2-6.fc20.x86_64
   qemu-system-sparc-1.6.2-6.fc20.x86_64
   qemu-system-lm32-1.6.2-6.fc20.x86_64
   qemu-img-1.6.2-6.fc20.x86_64
   qemu-system-s390x-1.6.2-6.fc20.x86_64
   qemu-system-cris-1.6.2-6.fc20.x86_64
   qemu-1.6.2-6.fc20.x86_64
   qemu-system-xtensa-1.6.2-6.fc20.x86_64
   qemu-system-moxie-1.6.2-6.fc20.x86_64
   qemu-system-ppc-1.6.2-6.fc20.x86_64
   libvirt-daemon-driver-qemu-1.1.3.5-2.fc20.x86_64
   qemu-system-mips-1.6.2-6.fc20.x86_64
   qemu-system-alpha-1.6.2-6.fc20.x86_64
   qemu-guest-agent-1.6.1-2.fc20.x86_64
   qemu-user-1.6.2-6.fc20.x86_64
   qemu-system-x86-1.6.2-6.fc20.x86_64
   qemu-system-arm-1.6.2-6.fc20.x86_64
   qemu-system-unicore32-1.6.2-6.fc20.x86_64
   qemu-system-or32-1.6.2-6.fc20.x86_64
   [root@localhost pkgs]#
 
 To manage notifications about this bug go to:
 https://bugs.launchpad.net/qemu/+bug/1347387/+subscriptions
 


pgpgvyCSnx7qc.pgp
Description: PGP signature

[Qemu-devel] Dynamic QEMU platform device instantiation in machine files: phone call on Wed July 30

2014-07-28 Thread Eric Auger

Dear all,

For your information, a phone call will be held this week on Wed July
30, 17h-18h CET to address the topic of dynamic instantiation of QEMU
platform devices in machine files (using the -device qemu option).

Related threads are:
http://lists.gnu.org/archive/html/qemu-ppc/2014-07/msg00047.html
http://lists.gnu.org/archive/html/qemu-devel/2014-07/msg01019.html

The objective of the meeting is to discuss (and hopefully reach a
consensus) on where we can put the code associated to platform device dt
node generation and also qom binding stuff.

In case you are interested in this discussion, please contact me and I
will send you the phone call details.

Best Regards

Eric

Re: [Qemu-devel] [PATCH v3] docs/multiple-iothreads.txt: add documentation on IOThread programming

On Wed, Jul 23, 2014 at 12:55:32PM +0100, Stefan Hajnoczi wrote:
 This document explains how IOThreads and the main loop are related,
 especially how to write code that can run in an IOThread.  Currently
 only virtio-blk-data-plane uses these techniques.  The next obvious
 target is virtio-scsi; there has also been work on virtio-net.
 
 Signed-off-by: Stefan Hajnoczi stefa...@redhat.com
 ---
  docs/multiple-iothreads.txt | 134 
 
  1 file changed, 134 insertions(+)
  create mode 100644 docs/multiple-iothreads.txt

Applied to my block-next tree:
https://github.com/stefanha/qemu/commits/block-next

Stefan


pgpG3fkA8n1To.pgp
Description: PGP signature

Re: [Qemu-devel] [PATCH for-2.1] qemu-options: fix another allows-to for -net l2tpv3

On Thu, Jul 24, 2014 at 08:11:26PM +0400, Michael Tokarev wrote:
 Signed-off-by: Michael Tokarev m...@tls.msk.ru
 ---
  qemu-options.hx |2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)
 
 diff --git a/qemu-options.hx b/qemu-options.hx
 index 9e54686..1549625 100644
 --- a/qemu-options.hx
 +++ b/qemu-options.hx
 @@ -1437,7 +1437,7 @@ DEF(net, HAS_ARG, QEMU_OPTION_net,
  -net 
 l2tpv3[,vlan=n][,name=str],src=srcaddr,dst=dstaddr[,srcport=srcport][,dstport=dstport],txsession=txsession[,rxsession=rxsession][,ipv6=on/off][,udp=on/off][,cookie64=on/off][,counter][,pincounter][,txcookie=txcookie][,rxcookie=rxcookie][,offset=offset]\n
  connect the VLAN to an Ethernet over L2TPv3 
 pseudowire\n
  Linux kernel 3.3+ as well as most routers can talk\n
 -L2TPv3. This transport allows to connect a VM to a 
 VM,\n
 +L2TPv3. This transport allows connecting a VM to a 
 VM,\n

Reviewed-by: Stefan Hajnoczi stefa...@redhat.com


pgpkBKzxPozae.pgp
Description: PGP signature

[Qemu-devel] [PATCH v2 1/4] parallels: extend parallels format header with actual data values

Parallels image format has several additional fields inside:
- nb_sectors is actually 64 bit wide. Upper 32bits are not used for
  images with signature WithoutFreeSpace and must be explicitly
  zeroed according to Parallels. They will be used for images with
  signature WithouFreSpacExt
- inuse is magic which means that the image is currently opened for
  read/write or was not closed correctly, the magic is 0x746f6e59
- data_off is the location of the first data block. It can be zero
  and in this case data starts just beyond the header aligned to
  512 bytes. Though this field does not matter for read-only driver

This patch adds these values to struct parallels_header and adds
proper handling of nb_sectors for currently supported WithoutFreeSpace
images.

WithouFreSpacExt will be covered in next patches.

Signed-off-by: Denis V. Lunev d...@openvz.org
CC: Kevin Wolf kw...@redhat.com
CC: Stefan Hajnoczi stefa...@redhat.com
CC: Jeff Cody jc...@redhat.com
---
 block/parallels.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/block/parallels.c b/block/parallels.c
index 1a5bd35..e39 100644
--- a/block/parallels.c
+++ b/block/parallels.c
@@ -41,8 +41,10 @@ struct parallels_header {
 uint32_t cylinders;
 uint32_t tracks;
 uint32_t catalog_entries;
-uint32_t nb_sectors;
-char padding[24];
+uint64_t nb_sectors;
+uint32_t inuse;
+uint32_t data_off;
+char padding[12];
 } QEMU_PACKED;
 
 typedef struct BDRVParallelsState {
@@ -90,7 +92,7 @@ static int parallels_open(BlockDriverState *bs, QDict 
*options, int flags,
 goto fail;
 }
 
-bs-total_sectors = le32_to_cpu(ph.nb_sectors);
+bs-total_sectors = 0x  le64_to_cpu(ph.nb_sectors);
 
 s-tracks = le32_to_cpu(ph.tracks);
 if (s-tracks == 0) {
-- 
1.9.1

[Qemu-devel] [PATCH v2 2/4] parallels: replace tabs with spaces in block/parallels.c

Signed-off-by: Denis V. Lunev d...@openvz.org
Reviewed-by: Jeff Cody jc...@redhat.com
CC: Kevin Wolf kw...@redhat.com
CC: Stefan Hajnoczi stefa...@redhat.com
---
 block/parallels.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/block/parallels.c b/block/parallels.c
index e39..16d14ad 100644
--- a/block/parallels.c
+++ b/block/parallels.c
@@ -61,11 +61,11 @@ static int parallels_probe(const uint8_t *buf, int 
buf_size, const char *filenam
 const struct parallels_header *ph = (const void *)buf;
 
 if (buf_size  HEADER_SIZE)
-   return 0;
+return 0;
 
 if (!memcmp(ph-magic, HEADER_MAGIC, 16) 
-   (le32_to_cpu(ph-version) == HEADER_VERSION))
-   return 100;
+(le32_to_cpu(ph-version) == HEADER_VERSION))
+return 100;
 
 return 0;
 }
@@ -115,7 +115,7 @@ static int parallels_open(BlockDriverState *bs, QDict 
*options, int flags,
 }
 
 for (i = 0; i  s-catalog_size; i++)
-   le32_to_cpus(s-catalog_bitmap[i]);
+le32_to_cpus(s-catalog_bitmap[i]);
 
 qemu_co_mutex_init(s-lock);
 return 0;
@@ -135,7 +135,7 @@ static int64_t seek_to_sector(BlockDriverState *bs, int64_t 
sector_num)
 
 /* not allocated */
 if ((index  s-catalog_size) || (s-catalog_bitmap[index] == 0))
-   return -1;
+return -1;
 return (uint64_t)(s-catalog_bitmap[index] + offset) * 512;
 }
 
-- 
1.9.1

[Qemu-devel] [PATCH v2 3/4] parallels: split check for parallels format in parallels_open

and rework error path a bit. There is no difference at the moment, but
the code will be definitely shorter when additional processing will
be required for WithouFreSpacExt

Signed-off-by: Denis V. Lunev d...@openvz.org
CC: Jeff Cody jc...@redhat.com
CC: Kevin Wolf kw...@redhat.com
CC: Stefan Hajnoczi stefa...@redhat.com
---
 block/parallels.c | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/block/parallels.c b/block/parallels.c
index 16d14ad..466705e 100644
--- a/block/parallels.c
+++ b/block/parallels.c
@@ -85,11 +85,11 @@ static int parallels_open(BlockDriverState *bs, QDict 
*options, int flags,
 goto fail;
 }
 
-if (memcmp(ph.magic, HEADER_MAGIC, 16) ||
-(le32_to_cpu(ph.version) != HEADER_VERSION)) {
-error_setg(errp, Image not in Parallels format);
-ret = -EINVAL;
-goto fail;
+if (le32_to_cpu(ph.version) != HEADER_VERSION) {
+goto fail_format;
+}
+if (memcmp(ph.magic, HEADER_MAGIC, 16)) {
+goto fail_format;
 }
 
 bs-total_sectors = 0x  le64_to_cpu(ph.nb_sectors);
@@ -120,6 +120,9 @@ static int parallels_open(BlockDriverState *bs, QDict 
*options, int flags,
 qemu_co_mutex_init(s-lock);
 return 0;
 
+fail_format:
+error_setg(errp, Image not in Parallels format);
+ret = -EINVAL;
 fail:
 g_free(s-catalog_bitmap);
 return ret;
-- 
1.9.1

[Qemu-devel] [PATCH v2 0/4] block/parallels: 2TB+ parallels images support

Parallels has released in the recent updates of Parallels Server 5/6
new addition to his image format. Images with signature WithouFreSpacExt
have offsets in the catalog coded not as offsets in sectors (multiple
of 512 bytes) but offsets coded in blocks (i.e. header-tracks * 512)

In this case to code the virtual disk size for such images nb_sectors
field is extended to 64 bits. The reader of older images with signature
WithoutFreeSpace must manually zero most valuable bits of nb_sectors
on open.

Changes from v1:
- fixed message in patch 1
- added braces to conform qemu coding style in patches 3  4
- added check for ph.tracks in patch 4 to avoid offset overflow as suggested
  by Jeff

Signed-off-by: Denis V. Lunev d...@openvz.org
CC: Kevin Wolf kw...@redhat.com
CC: Stefan Hajnoczi stefa...@redhat.com
CC: Jeff Cody jc...@redhat.com

[Qemu-devel] [PATCH v2 4/4] parallels: 2TB+ parallels images support