Re: [Qemu-block] [PATCH v2 3/5] pci: Add INTERFACE_PCIE_DEVICE to all PCIe devices

2017-09-27 Thread David Gibson
On Wed, Sep 27, 2017 at 04:56:33PM -0300, Eduardo Habkost wrote:
> Change all devices that set is_express=1 to implement
> INTERFACE_PCIE_DEVICE.
> 
> Cc: Keith Busch 
> Cc: Kevin Wolf 
> Cc: Max Reitz 
> Cc: Dmitry Fleytman 
> Cc: Jason Wang 
> Cc: "Michael S. Tsirkin" 
> Cc: Marcel Apfelbaum 
> Cc: Paul Burton 
> Cc: Paolo Bonzini 
> Cc: Hannes Reinecke 
> Cc: qemu-block@nongnu.org
> Reviewed-by: Alistair Francis 
> Signed-off-by: Eduardo Habkost 

Reviewed-by: David Gibson 

> ---
> Changes v1 -> v2:
> * base-xhci is marked as hybrid, now (in another patch)
> * Included pcie-pci-bridge
> ---
>  hw/block/nvme.c| 4 
>  hw/net/e1000e.c| 4 
>  hw/pci-bridge/pcie_pci_bridge.c| 1 +
>  hw/pci-bridge/pcie_root_port.c | 4 
>  hw/pci-bridge/xio3130_downstream.c | 4 
>  hw/pci-bridge/xio3130_upstream.c   | 4 
>  hw/pci-host/xilinx-pcie.c  | 4 
>  hw/scsi/megasas.c  | 6 ++
>  8 files changed, 31 insertions(+)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index 9aa32692a3..441e21ed1f 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -1110,6 +1110,10 @@ static const TypeInfo nvme_info = {
>  .instance_size = sizeof(NvmeCtrl),
>  .class_init= nvme_class_init,
>  .instance_init = nvme_instance_init,
> +.interfaces = (InterfaceInfo[]) {
> +{ INTERFACE_PCIE_DEVICE },
> +{ }
> +},
>  };
>  
>  static void nvme_register_types(void)
> diff --git a/hw/net/e1000e.c b/hw/net/e1000e.c
> index 6c42b4478c..81f7934a59 100644
> --- a/hw/net/e1000e.c
> +++ b/hw/net/e1000e.c
> @@ -708,6 +708,10 @@ static const TypeInfo e1000e_info = {
>  .instance_size = sizeof(E1000EState),
>  .class_init = e1000e_class_init,
>  .instance_init = e1000e_instance_init,
> +.interfaces = (InterfaceInfo[]) {
> +{ INTERFACE_PCIE_DEVICE },
> +{ }
> +},
>  };
>  
>  static void e1000e_register_types(void)
> diff --git a/hw/pci-bridge/pcie_pci_bridge.c b/hw/pci-bridge/pcie_pci_bridge.c
> index 9aa5cc3e45..88db143633 100644
> --- a/hw/pci-bridge/pcie_pci_bridge.c
> +++ b/hw/pci-bridge/pcie_pci_bridge.c
> @@ -180,6 +180,7 @@ static const TypeInfo pcie_pci_bridge_info = {
>  .class_init = pcie_pci_bridge_class_init,
>  .interfaces = (InterfaceInfo[]) {
>  { TYPE_HOTPLUG_HANDLER },
> +{ INTERFACE_PCIE_DEVICE },
>  { },
>  }
>  };
> diff --git a/hw/pci-bridge/pcie_root_port.c b/hw/pci-bridge/pcie_root_port.c
> index 4d588cb22e..9b6e4ce512 100644
> --- a/hw/pci-bridge/pcie_root_port.c
> +++ b/hw/pci-bridge/pcie_root_port.c
> @@ -161,6 +161,10 @@ static const TypeInfo rp_info = {
>  .class_init= rp_class_init,
>  .abstract  = true,
>  .class_size = sizeof(PCIERootPortClass),
> +.interfaces = (InterfaceInfo[]) {
> +{ INTERFACE_PCIE_DEVICE },
> +{ }
> +},
>  };
>  
>  static void rp_register_types(void)
> diff --git a/hw/pci-bridge/xio3130_downstream.c 
> b/hw/pci-bridge/xio3130_downstream.c
> index e706f36cb7..7d2f7629c1 100644
> --- a/hw/pci-bridge/xio3130_downstream.c
> +++ b/hw/pci-bridge/xio3130_downstream.c
> @@ -195,6 +195,10 @@ static const TypeInfo xio3130_downstream_info = {
>  .name  = "xio3130-downstream",
>  .parent= TYPE_PCIE_SLOT,
>  .class_init= xio3130_downstream_class_init,
> +.interfaces = (InterfaceInfo[]) {
> +{ INTERFACE_PCIE_DEVICE },
> +{ }
> +},
>  };
>  
>  static void xio3130_downstream_register_types(void)
> diff --git a/hw/pci-bridge/xio3130_upstream.c 
> b/hw/pci-bridge/xio3130_upstream.c
> index a052224bbf..227997ce46 100644
> --- a/hw/pci-bridge/xio3130_upstream.c
> +++ b/hw/pci-bridge/xio3130_upstream.c
> @@ -166,6 +166,10 @@ static const TypeInfo xio3130_upstream_info = {
>  .name  = "x3130-upstream",
>  .parent= TYPE_PCIE_PORT,
>  .class_init= xio3130_upstream_class_init,
> +.interfaces = (InterfaceInfo[]) {
> +{ INTERFACE_PCIE_DEVICE },
> +{ }
> +},
>  };
>  
>  static void xio3130_upstream_register_types(void)
> diff --git a/hw/pci-host/xilinx-pcie.c b/hw/pci-host/xilinx-pcie.c
> index 4613dda1d2..7659253090 100644
> --- a/hw/pci-host/xilinx-pcie.c
> +++ b/hw/pci-host/xilinx-pcie.c
> @@ -317,6 +317,10 @@ static const TypeInfo xilinx_pcie_root_info = {
>  .parent = TYPE_PCI_BRIDGE,
>  .instance_size = sizeof(XilinxPCIERoot),
>  .class_init = xilinx_pcie_root_class_init,
> +.interfaces = (InterfaceInfo[]) {
> +{ INTERFACE_PCIE_DEVICE },
> +{ }
> +},
>  };
>  
>  static void xilinx_pcie_register(void)
> diff --git a/hw/scsi/megasas.c b/hw/scsi/megasas.c
> index 0db68aacee..535ee267c3 100644
> --- a/hw/scsi/megasas.c
> +++ b/hw/scsi/megasas.c
> @@ -2451,6 +2451,7 @@ typedef struct MegasasInfo {
>  int osts;
>  const VMStateDescription *vmsd;
>  Property *props;
> +InterfaceInfo *in

Re: [Qemu-block] [PATCH v2 4/5] pci: Add INTERFACE_CONVENTIONAL_PCI_DEVICE to Conventional PCI devices

2017-09-27 Thread David Gibson
On Wed, Sep 27, 2017 at 04:56:34PM -0300, Eduardo Habkost wrote:
> Add INTERFACE_CONVENTIONAL_PCI_DEVICE to all direct subtypes of
> TYPE_PCI_DEVICE, except:
> 
> 1) The ones that already have INTERFACE_PCIE_DEVICE set:
> 
> * base-xhci
> * e1000e
> * nvme
> * pvscsi
> * vfio-pci
> * virtio-pci
> * vmxnet3
> 
> 2) base-pci-bridge
> 
> Not all PCI bridges are Conventional PCI devices, so
> INTERFACE_CONVENTIONAL_PCI_DEVICE is added only to the subtypes
> that are actually Conventional PCI:
> 
> * dec-21154-p2p-bridge
> * i82801b11-bridge
> * pbm-bridge
> * pci-bridge
> 
> The direct subtypes of base-pci-bridge not touched by this patch
> are:
> 
> * xilinx-pcie-root: Already marked as PCIe-only.
> * pcie-pci-bridge: Already marked as PCIe-only.
> * pcie-port: all non-abstract subtypes of pcie-port are already
>   marked as PCIe-only devices.
> 
> 3) megasas-base
> 
> Not all megasas devices are Conventional PCI devices, so the
> interface names are added to the subclasses registered by
> megasas_register_types(), according to information in the
> megasas_devices[] array.
> 
> "megasas-gen2" already implements INTERFACE_PCIE_DEVICE, so add
> INTERFACE_CONVENTIONAL_PCI_DEVICE only to "megasas".
> 
> Acked-by: Alberto Garcia 
> Acked-by: John Snow 
> Acked-by: Anthony PERARD 
> Signed-off-by: Eduardo Habkost 

Reviewed-by: David Gibson 

and for the ppc devices

Acked-by: David Gibson 

> ---
> Changes v1 -> v2:
> * s/legacy/conventional/
>   * Suggested-by: Alex Williamson 
> * Note about pcie-pci-bridge on commit message.
> * New devices: sungem, sunhme
> 
> Cc: "Michael S. Tsirkin" 
> Cc: Igor Mammedov 
> Cc: Gerd Hoffmann 
> Cc: Paolo Bonzini 
> Cc: Richard Henderson 
> Cc: Eduardo Habkost 
> Cc: Stefano Stabellini 
> Cc: Anthony Perard 
> Cc: John Snow 
> Cc: Alberto Garcia 
> Cc: Aurelien Jarno 
> Cc: Yongbok Kim 
> Cc: Jiri Slaby 
> Cc: Alexander Graf 
> Cc: Marcel Apfelbaum 
> Cc: Jason Wang 
> Cc: Jiri Pirko 
> Cc: "Hervé Poussineau" 
> Cc: Peter Maydell 
> Cc: David Gibson 
> Cc: Hannes Reinecke 
> Cc: Mark Cave-Ayland 
> Cc: Artyom Tarasenko 
> Cc: Alex Williamson 
> Cc: qemu-de...@nongnu.org
> Cc: xen-de...@lists.xenproject.org
> Cc: qemu-block@nongnu.org
> Cc: qemu-...@nongnu.org
> Cc: qemu-...@nongnu.org
> ---
>  hw/acpi/piix4.c |  1 +
>  hw/audio/ac97.c |  4 
>  hw/audio/es1370.c   |  4 
>  hw/audio/intel-hda.c|  4 
>  hw/char/serial-pci.c| 12 
>  hw/display/cirrus_vga.c |  4 
>  hw/display/qxl.c|  4 
>  hw/display/sm501.c  |  4 
>  hw/display/vga-pci.c|  4 
>  hw/display/vmware_vga.c |  4 
>  hw/i2c/smbus_ich9.c |  4 
>  hw/i386/amd_iommu.c |  4 
>  hw/i386/kvm/pci-assign.c|  4 
>  hw/i386/pc_piix.c   |  4 
>  hw/i386/xen/xen_platform.c  |  4 
>  hw/i386/xen/xen_pvdevice.c  |  4 
>  hw/ide/ich.c|  4 
>  hw/ide/pci.c|  4 
>  hw/ipack/tpci200.c  |  4 
>  hw/isa/i82378.c |  4 
>  hw/isa/lpc_ich9.c   |  1 +
>  hw/isa/piix4.c  |  4 
>  hw/isa/vt82c686.c   | 16 
>  hw/mips/gt64xxx_pci.c   |  4 
>  hw/misc/edu.c   |  5 +
>  hw/misc/ivshmem.c   |  4 
>  hw/misc/macio/macio.c   |  4 
>  hw/misc/pci-testdev.c   |  4 
>  hw/net/e1000.c  |  4 
>  hw/net/eepro100.c   |  4 
>  hw/net/ne2000.c |  4 
>  hw/net/pcnet-pci.c  |  4 
>  hw/net/rocker/rocker.c  |  4 
>  hw/net/rtl8139.c|  4 
>  hw/net/sungem.c |  4 
>  hw/net/sunhme.c |  4 
>  hw/pci-bridge/dec.c |  8 
>  hw/pci-bridge/i82801b11.c   |  4 
>  hw/pci-bridge/pci_bridge_dev.c  |  1 +
>  hw/pci-bridge/pci_expander_bridge.c |  8 
>  hw/pci-host/apb.c   |  8 
>  hw/pci-host/bonito.c|  4 
>  hw/pci-host/gpex.c  |  4 
>  hw/pci-host/grackle.c   |  4 
>  hw/pci-host/piix.c  |  8 
>  hw/pci-host/ppce500.c   |  4 
>  hw/pci-host/prep.c  |  4 
>  hw/pci-host/q35.c   |  4 
>  hw/pci-host/uninorth.c  | 16 
>  hw/pci-host/versatile.c |  4 
>  hw/ppc/ppc4xx_pci.c |  4 
>  hw/scsi/esp-pci.c   |  4 
>  hw/scsi/lsi53c895a.c|  4 
>  hw/scsi/megasas.c   |  4 
>  hw/scsi/mptsas.c|  4 
>  h

Re: [Qemu-block] [Qemu-arm] [Qemu-devel] [PATCH v1 1/8] Replace all occurances of __FUNCTION__ with __func__

2017-09-27 Thread Peter Maydell
On 26 September 2017 at 06:32, Eric Blake  wrote:
> On 09/25/2017 07:08 PM, Alistair Francis wrote:
>> diff --git a/hw/arm/nseries.c b/hw/arm/nseries.c
>> index 58005b6619..32687afced 100644
>> --- a/hw/arm/nseries.c
>> +++ b/hw/arm/nseries.c
>> @@ -463,7 +463,7 @@ static uint32_t mipid_txrx(void *opaque, uint32_t cmd, 
>> int len)
>>  uint8_t ret;
>>
>>  if (len > 9) {
>> -hw_error("%s: FIXME: bad SPI word width %i\n", __FUNCTION__, len);
>> +hw_error("%s: FIXME: bad SPI word width %i\n", __func__, len);
>
> Not this patch's problem, but it would probably be simpler if hw_error()
> were a macro that automatically prefixed __func__, rather than making
> every caller have to supply it themselves.

I'm not sure there's a great deal of benefit to that change, because
use of hw_error() in new code is rarely correct (it does an abort()
so it should never be used for guest-triggered conditions, which is
about the only time that you might be interested in a guest register
dump rather than just asserting). Most of its existing uses are in
crufty old device models.

thanks
-- PMM



Re: [Qemu-block] [Qemu-devel] [PATCH v1 2/8] tests: Replace fprintf(stderr, "*\n" with error_report()

2017-09-27 Thread Alistair Francis
On Mon, Sep 25, 2017 at 5:55 PM, Emilio G. Cota  wrote:
> On Mon, Sep 25, 2017 at 17:08:48 -0700, Alistair Francis wrote:
>> diff --git a/tests/atomic_add-bench.c b/tests/atomic_add-bench.c
>> index caa1e8e689..41ba1600c0 100644
>> --- a/tests/atomic_add-bench.c
>> +++ b/tests/atomic_add-bench.c
>> @@ -29,7 +29,7 @@ static const char commands_string[] =
>>  static void usage_complete(char *argv[])
>>  {
>>  fprintf(stderr, "Usage: %s [options]\n", argv[0]);
>> -fprintf(stderr, "options:\n%s\n", commands_string);
>> +fprintf(stderr, "options:\n%s", commands_string);
>>  }
>
> We do want that trailing \n, unless we move it to commands_string.

Yeah, this was a mistake. It should have swapped to error_report().

>
> Also, I think using error_report here would be confusing -- this is a 
> standalone
> test program with as little QEMU-specific knowledge as possible (QemuThreads
> are used for portability); using error_report here is confusing (this is not
> an error).

Do you mean in all tests/ sub directory or just a few certain cases?

>
>> diff --git a/tests/check-qlit b/tests/check-qlit
>> new file mode 100755
>> index 
>> ..950429524e3eb07e6daed1fe01caad0f5d554809
>> GIT binary patch
>> literal 272776
>> zcmeEvdwf*Ywf~vPB$){zGeCghJ;($So{10*LNEgfoIs*MKvBRDLV(l&F`3b5QKOS6
>
> ? I don't know what this is, I don't seem to have this binary in my
> checked out tree.

Yeah, this shouldn't be here.

Thanks,
Alistair

>
> (snips thousands of lines)
>
>> diff --git a/tests/qht-bench.c b/tests/qht-bench.c
>> index 11c1cec766..2637d601a9 100644
>> --- a/tests/qht-bench.c
>> +++ b/tests/qht-bench.c
>> @@ -5,6 +5,7 @@
>>   *   See the COPYING file in the top-level directory.
>>   */
>>  #include "qemu/osdep.h"
>> +#include "qemu/error-report.h"
>>  #include "qemu/processor.h"
>>  #include "qemu/atomic.h"
>>  #include "qemu/qht.h"
>> @@ -89,7 +90,7 @@ static const char commands_string[] =
>>  static void usage_complete(int argc, char *argv[])
>>  {
>>  fprintf(stderr, "Usage: %s [options]\n", argv[0]);
>> -fprintf(stderr, "options:\n%s\n", commands_string);
>> +fprintf(stderr, "options:\n%s", commands_string);
>
> Same as above: this removes the necessary trailing \n.
>
>>  exit(-1);
>>  }
>>
>> @@ -328,7 +329,7 @@ static void htable_init(void)
>>  retries++;
>>  }
>>  }
>> -fprintf(stderr, " populated after %zu retries\n", retries);
>> +error_report(" populated after %zu retries", retries);
>>  }
>
> ditto -- I'd rather keep fprintf(stderr) here, it's less confusing.
>
> Thanks,
>
> Emilio



Re: [Qemu-block] [Qemu-devel] [PATCH v2] virtio: introduce `info virtio' hmp command

2017-09-27 Thread Jan Dakinevich

> OK but if it's useful as an hmp command, why not as a qmp command?
>

The command is designed for debugging and produces quite sightly output. 
For respective qmp command most of `info virtio' output would excessive 
and unneccesary.


On 09/27/2017 10:43 PM, Michael S. Tsirkin wrote:

On Wed, Sep 27, 2017 at 06:09:42PM +0300, Jan Dakinevich wrote:

The command is intended for exposing device specific virtio feature bits
and their negotiation status. It is convenient and useful for debug
purpose.

Names of features are taken from a devices via get_feature_name() within
VirtioDeviceClass. If certain device doesn't implement it, the command
will print only hexadecimal value of feature mask.

Cc: Denis V. Lunev 
Signed-off-by: Jan Dakinevich 


OK but if it's useful as an hmp command, why not as a qmp command?


---
v2:
* Moved hmp_info_virtio and stuff to hmp.c to avoid build error
* Added printing of device status
* Listed common virtio features

v1: http://lists.nongnu.org/archive/html/qemu-devel/2017-09/msg07247.html
---
  hmp-commands-info.hx|  14 +
  hmp.c   | 148 
  hmp.h   |   1 +
  hw/block/virtio-blk.c   |  20 ++
  hw/char/virtio-serial-bus.c |  15 +
  hw/display/virtio-gpu.c |  12 
  hw/net/virtio-net.c |  34 ++
  hw/scsi/virtio-scsi.c   |  15 +
  hw/virtio/virtio-balloon.c  |  15 +
  hw/virtio/virtio-bus.c  |   1 +
  include/hw/virtio/virtio.h  |   2 +
  monitor.c   |   1 +
  12 files changed, 278 insertions(+)

diff --git a/hmp-commands-info.hx b/hmp-commands-info.hx
index 4f1ece9..2550027 100644
--- a/hmp-commands-info.hx
+++ b/hmp-commands-info.hx
@@ -868,6 +868,20 @@ ETEXI
  },
  
  STEXI

+@item info virtio
+@findex virtio
+Display guest and host fetures for all virtio devices.
+ETEXI
+
+{
+.name   = "virtio",
+.args_type  = "",
+.params = "",
+.help   = "show virtio info",
+.cmd= hmp_info_virtio,
+},
+
+STEXI
  @end table
  ETEXI
  
diff --git a/hmp.c b/hmp.c

index ace729d..009d808 100644
--- a/hmp.c
+++ b/hmp.c
@@ -43,6 +43,7 @@
  #include "hw/intc/intc.h"
  #include "migration/snapshot.h"
  #include "migration/misc.h"
+#include "hw/virtio/virtio.h"
  
  #ifdef CONFIG_SPICE

  #include 
@@ -2894,3 +2895,150 @@ void hmp_info_memory_size_summary(Monitor *mon, const 
QDict *qdict)
  }
  hmp_handle_error(mon, &err);
  }
+
+#define HMP_INFO_VIRTIO_INDENT 2
+#define HMP_INFO_VIRTIO_FIELD 32
+
+static void hmp_info_virtio_print_status(Monitor *mon, VirtIODevice *vdev)
+{
+uint8_t status[] = {
+VIRTIO_CONFIG_S_ACKNOWLEDGE,
+VIRTIO_CONFIG_S_DRIVER,
+VIRTIO_CONFIG_S_DRIVER_OK,
+VIRTIO_CONFIG_S_FEATURES_OK,
+VIRTIO_CONFIG_S_NEEDS_RESET,
+VIRTIO_CONFIG_S_FAILED,
+};
+const char *names[] = {
+"acknowledge",
+"driver",
+"driver_ok",
+"features_ok",
+"needs_reset"
+"failed",
+};
+const char *comma = "";
+int idx;
+
+monitor_printf(mon, "%*sstatus:   0x%02"PRIx8" ",
+   HMP_INFO_VIRTIO_INDENT, "", vdev->status);
+
+for (idx = 0; idx < ARRAY_SIZE(status); idx++) {
+if (!(vdev->status & status[idx])) {
+continue;
+}
+monitor_printf(mon, "%s%s", comma, names[idx]);
+if (names[idx]) {
+comma = ",";
+}
+}
+monitor_printf(mon, "\n");
+}
+
+static void hmp_info_virtio_print_common_features(Monitor *mon,
+  VirtIODevice *vdev)
+{
+uint64_t features[] = {
+VIRTIO_RING_F_INDIRECT_DESC,
+VIRTIO_RING_F_EVENT_IDX,
+VIRTIO_F_NOTIFY_ON_EMPTY,
+VIRTIO_F_ANY_LAYOUT,
+VIRTIO_F_IOMMU_PLATFORM,
+VIRTIO_F_VERSION_1,
+};
+const char *names[] = {
+"indirect_desc",
+"event_idx",
+"notify_on_empty",
+"any_layout",
+"iommu_platform",
+"version_1",
+};
+int idx;
+
+monitor_printf(mon, "%*scommon features:\n", HMP_INFO_VIRTIO_INDENT, "");
+
+for (idx = 0; idx < ARRAY_SIZE(features); idx++) {
+const char *ack = vdev->guest_features & features[idx] ? "acked" : "";
+
+if (!(vdev->host_features & features[idx])) {
+continue;
+}
+monitor_printf(mon, "%*s%*s%*s\n", HMP_INFO_VIRTIO_INDENT, "",
+   HMP_INFO_VIRTIO_FIELD, names[idx],
+   HMP_INFO_VIRTIO_FIELD, ack);
+}
+}
+
+static void hmp_info_virtio_print_specific_features(Monitor *mon,
+VirtIODevice *vdev)
+{
+VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev);
+int idx;
+
+if (!vdc->get_feature_name) {
+return;
+}
+
+monitor_printf(mon, "%*sspecific features:\n", HMP_INFO_VIRTIO_INDENT, 

Re: [Qemu-block] [Qemu-devel] [PATCH v1 1/8] Replace all occurances of __FUNCTION__ with __func__

2017-09-27 Thread Alistair Francis
On Tue, Sep 26, 2017 at 6:32 AM, Eric Blake  wrote:
> On 09/25/2017 07:08 PM, Alistair Francis wrote:
>> Replace all occurs of __FUNCTION__ except for the check in checkpatch
>> with the non GCC specific __func__.
>>
>> One line in hcd-musb.c was manually tweaked to pass checkpatch.
>>
>> Signed-off-by: Alistair Francis 
>> Cc: Gerd Hoffmann 
>> Cc: Andrzej Zaborowski 
>> Cc: Stefano Stabellini 
>> Cc: Anthony Perard 
>> Cc: John Snow 
>> Cc: Aurelien Jarno 
>> Cc: Yongbok Kim 
>> Cc: Peter Crosthwaite 
>> Cc: Stefan Hajnoczi  (supporter:Block
>> Cc: Fam Zheng  (supporter:Block
>
> That looks funny, with no closing ).  Something's breaking down between
> get_maintainers.pl and your eventual email, although it's not fatal.

Yeah, that's a copy and paste error, will fix.

>
>> Cc: Juan Quintela 
>> Cc: "Dr. David Alan Gilbert" 
>> Cc: qemu-...@nongnu.org
>> Cc: qemu-block@nongnu.org
>> Cc: xen-de...@lists.xenproject.org
>> ---
>>
>
>>  65 files changed, 273 insertions(+), 273 deletions(-)
>
> Big but mechanical, so I'm okay without splitting it further.

Phew! I did not want to split it.

>
>>
>> diff --git a/audio/audio_int.h b/audio/audio_int.h
>> index 5bcb1c60e1..543b1bd8d5 100644
>> --- a/audio/audio_int.h
>> +++ b/audio/audio_int.h
>> @@ -253,7 +253,7 @@ static inline int audio_ring_dist (int dst, int src, int 
>> len)
>>  #define AUDIO_STRINGIFY(n) AUDIO_STRINGIFY_(n)
>>
>>  #if defined _MSC_VER || defined __GNUC__
>> -#define AUDIO_FUNC __FUNCTION__
>> +#define AUDIO_FUNC __func__
>>  #else
>>  #define AUDIO_FUNC __FILE__ ":" AUDIO_STRINGIFY (__LINE__)
>>  #endif
>
> This can be further simplified.  We really aren't using _MSC_VER as our
> compiler (can anyone prove me wrong?), and we DO require a C99 compiler
> (per C99 6.4.2.2, __func__ support is mandatory), so we don't really
> need the #else branch (or, for that matter, we probably don't even need
> AUDIO_FUNC).  But to keep this patch mechanical, that can be a separate
> followup.

I have a second patch that removes AUDIO_FUNC

>
>> diff --git a/hw/arm/nseries.c b/hw/arm/nseries.c
>> index 58005b6619..32687afced 100644
>> --- a/hw/arm/nseries.c
>> +++ b/hw/arm/nseries.c
>> @@ -463,7 +463,7 @@ static uint32_t mipid_txrx(void *opaque, uint32_t cmd, 
>> int len)
>>  uint8_t ret;
>>
>>  if (len > 9) {
>> -hw_error("%s: FIXME: bad SPI word width %i\n", __FUNCTION__, len);
>> +hw_error("%s: FIXME: bad SPI word width %i\n", __func__, len);
>
> Not this patch's problem, but it would probably be simpler if hw_error()
> were a macro that automatically prefixed __func__, rather than making
> every caller have to supply it themselves.

I'm going to leave this for another day, but I think you are right.

>
>> +++ b/hw/arm/omap1.c
>
>> @@ -1716,7 +1716,7 @@ static void omap_clkm_write(void *opaque, hwaddr addr,
>>  case 0x18:   /* ARM_SYSST */
>>  if ((s->clkm.clocking_scheme ^ (value >> 11)) & 7) {
>>  s->clkm.clocking_scheme = (value >> 11) & 7;
>> -printf("%s: clocking scheme set to %s\n", __FUNCTION__,
>> +printf("%s: clocking scheme set to %s\n", __func__,
>>  clkschemename[s->clkm.clocking_scheme]);
>
> Worth fixing the indentation while you are here?

Fixed

>
>> @@ -2473,7 +2473,7 @@ static void omap_pwt_write(void *opaque, hwaddr addr,
>>  case 0x04:   /* VRC */
>>  if ((value ^ s->vrc) & 1) {
>>  if (value & 1)
>> -printf("%s: %iHz buzz on\n", __FUNCTION__, (int)
>> +printf("%s: %iHz buzz on\n", __func__, (int)
>>  /* 1.5 MHz from a 12-MHz or 13-MHz PWT_CLK 
>> */
>>  ((omap_clk_getrate(s->clk) >> 3) /
>>   /* Pre-multiplexer divider */
>
> Likewise?

This doesn't actually change the indention. It's all one argument to a function.

>
>> @@ -3330,13 +3330,13 @@ static void omap_mcbsp_writeh(void *opaque, hwaddr 
>> addr,
>>  s->mcr[1] = value & 0x03e3;
>>  if (value & 3)   /* XMCM */
>>  printf("%s: Tx channel selection mode enable attempt\n",
>> -__FUNCTION__);
>> +__func__);
>>  return;
>>  case 0x1a:   /* MCR1 */
>>  s->mcr[0] = value & 0x03e1;
>>  if (value & 1)   /* RMCM */
>>  printf("%s: Rx channel selection mode enable attempt\n",
>> -__FUNCTION__);
>> +__func__);
>
> and again

Fixed.

>
>
>> +++ b/hw/arm/omap2.c
>> @@ -1312,7 +1312,7 @@ static void omap_prcm_apll_update(struct omap_prcm_s 
>> *s)
>>
>>  if (mode[0] == 1 || mode[0] == 2 || mode[1] == 1 || mode[1] == 2)
>>  fprintf(stderr, "%s: bad EN_54M_PLL or bad EN_96M_PLL\n",
>> -__FUNCTION__);
>> +__func__);
>
> More of 

Re: [Qemu-block] [Qemu-devel] [PATCH v1 3/8] hw: Replace fprintf(stderr, "*\n" with error_report()

2017-09-27 Thread Alistair Francis
On Mon, Sep 25, 2017 at 8:51 PM, Thomas Huth  wrote:
> On 26.09.2017 02:08, Alistair Francis wrote:
>> Replace a large number of the fprintf(stderr, "*\n" calls with
>> error_report(). The functions were renamed with these commands and then
>> compiler issues where manually fixed.
>>
>> find ./* -type f -exec sed -i \
>> 'N;N;N;N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, 
>> "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
>> {} +
>> find ./* -type f -exec sed -i \
>> 'N;N;N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, 
>> "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
>> {} +
>> find ./* -type f -exec sed -i \
>> 'N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, 
>> "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
>> {} +
>> find ./* -type f -exec sed -i \
>> 'N;N;N;N;N;N;N;N; {s|fprintf(stderr, 
>> "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
>> {} +
>> find ./* -type f -exec sed -i \
>> 'N;N;N;N;N;N;N; {s|fprintf(stderr, 
>> "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
>> {} +
>> find ./* -type f -exec sed -i \
>> 'N;N;N;N;N;N; {s|fprintf(stderr, 
>> "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
>> {} +
>> find ./* -type f -exec sed -i \
>> 'N;N;N;N;N; {s|fprintf(stderr, 
>> "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
>> {} +
>> find ./* -type f -exec sed -i \
>> 'N;N;N;N; {s|fprintf(stderr, 
>> "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
>> {} +
>> find ./* -type f -exec sed -i \
>> 'N;N;N; {s|fprintf(stderr, 
>> "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
>> {} +
>> find ./* -type f -exec sed -i \
>> 'N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
>> {} +
>> find ./* -type f -exec sed -i \
>> 'N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
>> {} +
>>
>> Some lines where then manually tweaked to pass checkpatch.
>>
>> Signed-off-by: Alistair Francis 
>> Cc: Andrzej Zaborowski 
>> Cc: Jan Kiszka 
>> Cc: Stefan Hajnoczi 
>> Cc: Paolo Bonzini 
>> Cc: Thomas Huth 
>> Cc: Gerd Hoffmann 
>> Cc: "Michael S. Tsirkin" 
>> Cc: Richard Henderson 
>> Cc: Eduardo Habkost 
>> Cc: Stefano Stabellini 
>> Cc: Anthony Perard 
>> Cc: John Snow 
>> Cc: Christian Borntraeger 
>> Cc: Cornelia Huck 
>> Cc: Alexander Graf 
>> Cc: Michael Walle 
>> Cc: Paul Burton 
>> Cc: Aurelien Jarno 
>> Cc: Yongbok Kim 
>> Cc: "Hervé Poussineau" 
>> Cc: Anthony Green 
>> Cc: Jason Wang 
>> Cc: Chris Wulff 
>> Cc: Marek Vasut 
>> Cc: Jia Liu 
>> Cc: Stafford Horne 
>> Cc: Marcel Apfelbaum 
>> Cc: Magnus Damm 
>> Cc: Fabien Chouteau 
>> Cc: Mark Cave-Ayland 
>> Cc: Artyom Tarasenko 
>> Cc: qemu-...@nongnu.org
>> Cc: qemu-block@nongnu.org
>> Cc: xen-de...@lists.xenproject.org
>> Cc: qemu-...@nongnu.org
>> ---
>>
>>  hw/arm/armv7m.c |  2 +-
>>  hw/arm/boot.c   | 34 +--
>>  hw/arm/gumstix.c| 13 
>>  hw/arm/mainstone.c  |  7 ++--
>>  hw/arm/musicpal.c   |  2 +-
>>  hw/arm/omap1.c  |  5 +--
>>  hw/arm/omap2.c  | 21 ++--
>>  hw/arm/omap_sx1.c   |  6 ++--
>>  hw/arm/palm.c   | 10 +++---
>>  hw/arm/pxa2xx.c |  7 ++--
>>  hw/arm/stellaris.c  |  3 +-
>>  hw/arm/tosa.c   | 17 +-
>>  hw/arm/versatilepb.c|  2 +-
>>  hw/arm/vexpress.c   |  8 ++---
>>  hw/arm/z2.c |  6 ++--
>>  hw/block/dataplane/virtio-blk.c |  6 ++--
>>  hw/block/onenand.c  |  8 ++---
>>  hw/block/tc58128.c  | 44 -
>>  hw/bt/core.c| 15 +
>>  hw/bt/hci-csr.c | 17 +-
>>  hw/bt/hci.c | 30 -
>>  hw/bt/hid.c |  2 +-
>>  hw/bt/l2cap.c   | 47 ++-
>>  hw/bt/sdp.c |  7 ++--
>>  hw/char/exynos4210_uart.c   |  6 ++--
>>  hw/char/mcf_uart.c  |  5 +--
>>  hw/char/sh_serial.c |  9 +++---
>>  hw/core/loader.c| 31 +-
>>  hw/core/ptimer.c|  7 ++--
>>  hw/cris/axis_dev88.c|  3 +-
>>  hw/cris/boot.c  |  5 +--
>>  hw/display/blizzard.c   | 20 ++--
>>  hw/display/omap_dss.c   | 14 
>>  hw/display/pl110.c  |  2 +-
>>  hw/display/pxa2xx_lcd.c |  2 +-
>>  hw/display/qxl-render.c |  7 ++--
>>  hw/display/qxl.c| 10 +++---
>>  hw/display/tc6393xb.c   | 36 -
>>  hw/display/virtio-gpu-3d.c  |  4 +--
>>  hw/display/vmware_vga.c | 22 ++---
>>  hw/dma/omap_dma.c   | 26 ---
>>  hw/dma/soc_dma.c| 37 ++---
>>  hw/gpio/omap_gpio.c |  2 +-
>>  hw/i2c/omap_i2c.c   | 10 +++---
>>  hw/i386/kvm/apic.c  |  9 ++

Re: [Qemu-block] [Qemu-devel] [PATCH v4 18/23] qemu-img: Change compare_sectors() to be byte-based

2017-09-27 Thread John Snow


On 09/13/2017 12:03 PM, Eric Blake wrote:
> In the continuing quest to make more things byte-based, change
> compare_sectors(), renaming it to compare_buffers() in the
> process.  Note that one caller (qemu-img compare) only cares
> about the first difference, while the other (qemu-img rebase)
> cares about how many consecutive sectors have the same
> equal/different status; however, this patch does not bother to
> micro-optimize the compare case to avoid the comparisons of
> sectors beyond the first mismatch.  Both callers are always
> passing valid buffers in, so the initial check for buffer size
> can be turned into an assertion.
> 
> Signed-off-by: Eric Blake 
> 
> ---
> v3: new patch
> ---
>  qemu-img.c | 55 +++
>  1 file changed, 27 insertions(+), 28 deletions(-)
> 
> diff --git a/qemu-img.c b/qemu-img.c
> index 2e05f92e85..034122eba5 100644
> --- a/qemu-img.c
> +++ b/qemu-img.c
> @@ -1155,31 +1155,28 @@ static int is_allocated_sectors_min(const uint8_t 
> *buf, int n, int *pnum,
>  }
> 
>  /*
> - * Compares two buffers sector by sector. Returns 0 if the first sector of 
> both
> - * buffers matches, non-zero otherwise.
> + * Compares two buffers sector by sector. Returns 0 if the first
> + * sector of each buffer matches, non-zero otherwise.
>   *
> - * pnum is set to the number of sectors (including and immediately following
> - * the first one) that are known to have the same comparison result
> + * pnum is set to the sector-aligned size of the buffer prefix that
> + * has the same matching status as the first sector.
>   */
> -static int compare_sectors(const uint8_t *buf1, const uint8_t *buf2, int n,
> -int *pnum)
> +static int compare_buffers(const uint8_t *buf1, const uint8_t *buf2,
> +   int64_t bytes, int64_t *pnum)
>  {
>  bool res;
> -int i;
> +int64_t i = MIN(bytes, BDRV_SECTOR_SIZE);
> 
> -if (n <= 0) {
> -*pnum = 0;
> -return 0;
> -}
> +assert(bytes > 0);
> 
> -res = !!memcmp(buf1, buf2, 512);
> -for(i = 1; i < n; i++) {
> -buf1 += 512;
> -buf2 += 512;
> +res = !!memcmp(buf1, buf2, i);

It is temporarily confusing that 'i' is never again used for this
particular parameter, because

> +while (i < bytes) {

This gives the brief impression that we might be looping in a way that
changes the comparison size passed to memcmp, which isn't true.

Just me being cranky, though. It's probably still the best way, because
of how you have to prime the loop. Doing it the literal-minded way
requires an extra i += len, so:

Reviewed-by: John Snow 



Re: [Qemu-block] [Qemu-devel] [PATCH v4 01/23] block: Allow NULL file for bdrv_get_block_status()

2017-09-27 Thread Eric Blake
On 09/25/2017 05:43 PM, John Snow wrote:
> 
> 
> On 09/13/2017 12:03 PM, Eric Blake wrote:
>> Not all callers care about which BDS owns the mapping for a given
>> range of the file.  This patch merely simplifies the callers by
>> consolidating the logic in the common call point, while guaranteeing
>> a non-NULL file to all the driver callbacks, for no semantic change.
>> The only caller that does not care about pnum is bdrv_is_allocated,
>> as invoked by vvfat; we can likewise add assertions that the rest
>> of the stack does not have to worry about a NULL pnum.
>>
>> Furthermore, this will also set the stage for a future cleanup: when
>> a caller does not care about which BDS owns an offset, it would be
>> nice to allow the driver to optimize things to not have to return
>> BDRV_BLOCK_OFFSET_VALID in the first place.  In the case of fragmented
>> allocation (for example, it's fairly easy to create a qcow2 image
>> where consecutive guest addresses are not at consecutive host
>> addresses), the current contract requires bdrv_get_block_status()
>> to clamp *pnum to the limit where host addresses are no longer
>> consecutive, but allowing a NULL file means that *pnum could be
>> set to the full length of known-allocated data.
>>
> 
> Function reads slightly worse for the wear now with all of the return
> logic handled at various places within, but unifying it might be even
> stranger, perhaps..
> 
> Let's see if I hate this more:
> 
> out:
> bdrv_dec_in_flight(bs);
> bdrv_dec_in_flight(bs);
> if (ret >= 0 && sector_num + *pnum == total_sectors) {
> ret |= BDRV_BLOCK_EOF;
> }
> early_out:
> if (file) {
> *file = local_file;
> }
> return ret;
> 
> 
> and then earlier in the function, we can just:
> 
> if (total_sectors < 0) {
>   ret = total_sectors;
>   goto early_out;
> }

Seems reasonable enough, I'll work that in to v5, since there are other
reasons to respin the series anyway.

> 
> It's only shed paint, though:
> 
> Reviewed-by: John Snow 
> 
> I'm looking at the rest of the series now, so please stand by.
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [Qemu-devel] [PATCH v4 17/23] qemu-img: Change check_empty_sectors() to byte-based

2017-09-27 Thread John Snow


On 09/13/2017 12:03 PM, Eric Blake wrote:
> Continue on the quest to make more things byte-based instead of
> sector-based.
> 
> Signed-off-by: Eric Blake 


Reviewed-by: John Snow 



Re: [Qemu-block] [Qemu-devel] [PATCH v4 16/23] qemu-img: Drop redundant error message in compare

2017-09-27 Thread John Snow


On 09/13/2017 12:03 PM, Eric Blake wrote:
> If a read error is encountered during 'qemu-img compare', we
> were printing the "Error while reading offset ..." message twice.
> Update the testsuite for the improved output.
> 
> Further simplify the code by hoisting the error code conversion
> into the helper function, rather than repeating it at the callers.
> 
> Signed-off-by: Eric Blake 
> 
> ---
> v3: new patch
> ---
>  qemu-img.c | 19 +--
>  tests/qemu-iotests/074.out |  2 --
>  2 files changed, 5 insertions(+), 16 deletions(-)
> 
> diff --git a/qemu-img.c b/qemu-img.c
> index dfccebe6bc..3e1e373e8f 100644
> --- a/qemu-img.c
> +++ b/qemu-img.c
> @@ -1196,8 +1196,10 @@ static int64_t sectors_to_bytes(int64_t sectors)
>  /*
>   * Check if passed sectors are empty (not allocated or contain only 0 bytes)
>   *
> - * Returns 0 in case sectors are filled with 0, 1 if sectors contain non-zero
> - * data and negative value on error.
> + * Intended for use by 'qemu-img compare': Returns 0 in case sectors are
> + * filled with 0, 1 if sectors contain non-zero data (this is a comparison
> + * failure), and 4 on error (the exit status for read errors), after emitting
> + * an error message.
>   *
>   * @param blk:  BlockBackend for the image
>   * @param sect_num: Number of first sector to check
> @@ -1218,7 +1220,7 @@ static int check_empty_sectors(BlockBackend *blk, 
> int64_t sect_num,
>  if (ret < 0) {
>  error_report("Error while reading offset %" PRId64 " of %s: %s",
>   sectors_to_bytes(sect_num), filename, strerror(-ret));
> -return ret;
> +return 4;
>  }
>  idx = find_nonzero(buffer, sect_count * BDRV_SECTOR_SIZE);
>  if (idx >= 0) {
> @@ -1473,11 +1475,6 @@ static int img_compare(int argc, char **argv)
>filename2, buf1, quiet);
>  }
>  if (ret) {
> -if (ret < 0) {
> -error_report("Error while reading offset %" PRId64 ": 
> %s",
> - sectors_to_bytes(sector_num), 
> strerror(-ret));
> -ret = 4;
> -}
>  goto out;
>  }
>  }
> @@ -1522,12 +1519,6 @@ static int img_compare(int argc, char **argv)
>  ret = check_empty_sectors(blk_over, sector_num, nb_sectors,
>filename_over, buf1, quiet);
>  if (ret) {
> -if (ret < 0) {
> -error_report("Error while reading offset %" PRId64
> - " of %s: %s", 
> sectors_to_bytes(sector_num),
> - filename_over, strerror(-ret));
> -ret = 4;
> -}
>  goto out;
>  }
>  }
> diff --git a/tests/qemu-iotests/074.out b/tests/qemu-iotests/074.out
> index 8fba5aea9c..ede66c3f81 100644
> --- a/tests/qemu-iotests/074.out
> +++ b/tests/qemu-iotests/074.out
> @@ -4,7 +4,6 @@ Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1073741824
>  wrote 512/512 bytes at offset 512
>  512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
>  qemu-img: Error while reading offset 0 of 
> blkdebug:TEST_DIR/blkdebug.conf:TEST_DIR/t.IMGFMT: Input/output error
> -qemu-img: Error while reading offset 0: Input/output error
>  4
>  Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1073741824
>  Formatting 'TEST_DIR/t.IMGFMT.2', fmt=IMGFMT size=0
> @@ -12,7 +11,6 @@ Formatting 'TEST_DIR/t.IMGFMT.2', fmt=IMGFMT size=0
>  wrote 512/512 bytes at offset 512
>  512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
>  qemu-img: Error while reading offset 0 of 
> blkdebug:TEST_DIR/blkdebug.conf:TEST_DIR/t.IMGFMT: Input/output error
> -qemu-img: Error while reading offset 0 of 
> blkdebug:TEST_DIR/blkdebug.conf:TEST_DIR/t.IMGFMT: Input/output error
>  Warning: Image size mismatch!
>  4
>  Cleanup
> 

Hm, naively I might assume it's best for the caller to report the error
and to leave the function a nicely self-contained helper, but I won't
insist on it.

Reviewed-by: John Snow 



Re: [Qemu-block] [Qemu-devel] [PATCH] iotests: Fix 195 if IMGFMT is part of TEST_DIR

2017-09-27 Thread Eric Blake
On 09/27/2017 04:13 PM, Max Reitz wrote:
> do_run_qemu() in iotest 195 first applies _filter_imgfmt when printing
> qemu's command line and _filter_testdir only afterwards.  Therefore, if
> the image format is part of the test directory path, _filter_testdir
> will no longer apply and the actual output will differ from the
> reference output even in case of success.
> 
> For example, TEST_DIR might be "/tmp/test-qcow2", in which case
> _filter_imgfmt first transforms this to "/tmp/test-IMGFMT" which is no
> longer recognized as the TEST_DIR by _filter_testdir.
> 
> Fix this by not applying _filter_imgfmt in do_run_qemu() but in
> run_qemu() instead, and only after _filter_testdir.
> 
> Signed-off-by: Max Reitz 
> ---
>  tests/qemu-iotests/195 | 7 ---
>  1 file changed, 4 insertions(+), 3 deletions(-)

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [Qemu-devel] [PATCH v4 15/23] qemu-img: Add find_nonzero()

2017-09-27 Thread John Snow


On 09/13/2017 12:03 PM, Eric Blake wrote:
> During 'qemu-img compare', when we are checking that an allocated
> portion of one file is all zeros, we don't need to waste time
> computing how many additional sectors after the first non-zero
> byte are also non-zero.  Create a new helper find_nonzero() to do
> the check for a first non-zero sector, and rebase
> check_empty_sectors() to use it.
> 
> The new interface intentionally uses bytes in its interface, even
> though it still crawls the buffer a sector at a time; it is robust
> to a partial sector at the end of the buffer.
> 
> Signed-off-by: Eric Blake 

Reviewed-by: John Snow 



[Qemu-block] [PATCH] iotests: Fix 195 if IMGFMT is part of TEST_DIR

2017-09-27 Thread Max Reitz
do_run_qemu() in iotest 195 first applies _filter_imgfmt when printing
qemu's command line and _filter_testdir only afterwards.  Therefore, if
the image format is part of the test directory path, _filter_testdir
will no longer apply and the actual output will differ from the
reference output even in case of success.

For example, TEST_DIR might be "/tmp/test-qcow2", in which case
_filter_imgfmt first transforms this to "/tmp/test-IMGFMT" which is no
longer recognized as the TEST_DIR by _filter_testdir.

Fix this by not applying _filter_imgfmt in do_run_qemu() but in
run_qemu() instead, and only after _filter_testdir.

Signed-off-by: Max Reitz 
---
 tests/qemu-iotests/195 | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/tests/qemu-iotests/195 b/tests/qemu-iotests/195
index 05a239cbf5..e7a403ded2 100755
--- a/tests/qemu-iotests/195
+++ b/tests/qemu-iotests/195
@@ -44,15 +44,16 @@ _supported_os Linux
 
 function do_run_qemu()
 {
-echo Testing: "$@" | _filter_imgfmt
+echo Testing: "$@"
 $QEMU -nographic -qmp-pretty stdio -serial none "$@"
 echo
 }
 
 function run_qemu()
 {
-do_run_qemu "$@" 2>&1 | _filter_testdir | _filter_qemu | _filter_qmp \
-  | _filter_qemu_io | _filter_generated_node_ids
+do_run_qemu "$@" 2>&1 | _filter_testdir | _filter_imgfmt | _filter_qemu \
+  | _filter_qmp | _filter_qemu_io \
+  | _filter_generated_node_ids
 }
 
 size=64M
-- 
2.13.5




Re: [Qemu-block] [PATCH v4 0/6] Misc improvements to crypto block driver

2017-09-27 Thread Max Reitz
On 2017-09-27 14:53, Daniel P. Berrange wrote:
> This is a followup to
> 
>   v1: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg00781.html
>   v2: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg06464.html
>   v3: https://lists.gnu.org/archive/html/qemu-devel/2017-09/msg02923.html
> 
> This collection of patches first improves the performance of the
> crypto block driver and then does various cleanups to improve ongoing
> maint work.
> 
> Changed in v4:
> 
>   - Drop intermediate patch that replaced '512' with a constant (Max)
>   - Use MIN() macro where needed (Max)
>   - Fix bounce buffer size at 1MB instead of varying per sector size (Max)
>   - Convert missing qcrypto_block_encrypt call to sectors in qcow.c (Max)
> 
> Changed in v3:
> 
>   - Support passthrough of BDRV_REQ_FUA (Eric)
>   - Fix potential truncation of payload offset values (Eric)
>   - Use encryption scheme sector size instead of BDRV_SECTOR_SIZE (Kevin)
>   - Use QEMU_IS_ALIGNED where appropriate (Eric)
>   - Remove unused 'sector_num' variable (Eric)
>   - Fix whitespace alignment (Eric)
>   - Fix math error in qcow conversion (Eric)
> 
> Daniel P. Berrange (6):
>   block: use 1 MB bounce buffers for crypto instead of 16KB
>   crypto: expose encryption sector size in APIs
>   block: fix data type casting for crypto payload offset
>   block: convert crypto driver to bdrv_co_preadv|pwritev
>   block: convert qcrypto_block_encrypt|decrypt to take bytes offset
>   block: support passthrough of BDRV_REQ_FUA in crypto driver
> 
>  block/crypto.c | 130 
> ++---
>  block/qcow.c   |  11 +++--
>  block/qcow2-cluster.c  |   8 ++-
>  block/qcow2.c  |   4 +-
>  crypto/block-luks.c|  18 ---
>  crypto/block-qcow.c|  13 +++--
>  crypto/block.c |  26 +++---
>  crypto/blockpriv.h |   5 +-
>  include/crypto/block.h |  29 ---
>  9 files changed, 148 insertions(+), 96 deletions(-)

Thanks; hoping that is OK with you, I've applied this series to my block
branch:

https://github.com/XanClic/qemu/commits/block

Max



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [Qemu-devel] [PATCH v4 14/23] qemu-img: Speed up compare on pre-allocated larger file

2017-09-27 Thread John Snow


On 09/13/2017 12:03 PM, Eric Blake wrote:
> Compare the following images with all-zero contents:
> $ truncate --size 1M A
> $ qemu-img create -f qcow2 -o preallocation=off B 1G
> $ qemu-img create -f qcow2 -o preallocation=metadata C 1G
> 
> On my machine, the difference is noticeable for pre-patch speeds,
> with more than an order of magnitude in difference caused by the
> choice of preallocation in the qcow2 file:
> 
> $ time ./qemu-img compare -f raw -F qcow2 A B
> Warning: Image size mismatch!
> Images are identical.
> 
> real  0m0.014s
> user  0m0.007s
> sys   0m0.007s
> 
> $ time ./qemu-img compare -f raw -F qcow2 A C
> Warning: Image size mismatch!
> Images are identical.
> 
> real  0m0.341s
> user  0m0.144s
> sys   0m0.188s
> 
> Why? Because bdrv_is_allocated() returns false for image B but
> true for image C, throwing away the fact that both images know
> via lseek(SEEK_HOLE) that the entire image still reads as zero.
> From there, qemu-img ends up calling bdrv_pread() for every byte
> of the tail, instead of quickly looking for the next allocation.
> The solution: use block_status instead of is_allocated, giving:
> 
> $ time ./qemu-img compare -f raw -F qcow2 A C
> Warning: Image size mismatch!
> Images are identical.
> 
> real  0m0.014s
> user  0m0.011s
> sys   0m0.003s
> 
> which is on par with the speeds for no pre-allocation.
> 
> Signed-off-by: Eric Blake 

Makes good sense to me.

Reviewed-by: John Snow 



Re: [Qemu-block] [PATCH v4 5/6] block: convert qcrypto_block_encrypt|decrypt to take bytes offset

2017-09-27 Thread Max Reitz
On 2017-09-27 14:53, Daniel P. Berrange wrote:
> Instead of sector offset, take the bytes offset when encrypting
> or decrypting data.
> 
> Signed-off-by: Daniel P. Berrange 
> ---
>  block/crypto.c | 12 
>  block/qcow.c   | 11 +++
>  block/qcow2-cluster.c  |  8 +++-
>  block/qcow2.c  |  4 ++--
>  crypto/block-luks.c| 12 
>  crypto/block-qcow.c| 12 
>  crypto/block.c | 20 ++--
>  crypto/blockpriv.h |  4 ++--
>  include/crypto/block.h | 14 --
>  9 files changed, 56 insertions(+), 41 deletions(-)

Reviewed-by: Max Reitz 



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH v4 4/6] block: convert crypto driver to bdrv_co_preadv|pwritev

2017-09-27 Thread Max Reitz
On 2017-09-27 14:53, Daniel P. Berrange wrote:
> Make the crypto driver implement the bdrv_co_preadv|pwritev
> callbacks, and also use bdrv_co_preadv|pwritev for I/O
> with the protocol driver beneath. This replaces sector based
> I/O with byte based I/O, and allows us to stop assuming the
> physical sector size matches the encryption sector size.
> 
> Signed-off-by: Daniel P. Berrange 
> ---
>  block/crypto.c | 106 
> +
>  1 file changed, 54 insertions(+), 52 deletions(-)

Reviewed-by: Max Reitz 



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH v4 1/6] block: use 1 MB bounce buffers for crypto instead of 16KB

2017-09-27 Thread Max Reitz
On 2017-09-27 14:53, Daniel P. Berrange wrote:
> Using 16KB bounce buffers creates a significant performance
> penalty for I/O to encrypted volumes on storage which high
> I/O latency (rotating rust & network drives), because it
> triggers lots of fairly small I/O operations.
> 
> On tests with rotating rust, and cache=none|directsync,
> write speed increased from 2MiB/s to 32MiB/s, on a par
> with that achieved by the in-kernel luks driver. With
> other cache modes the in-kernel driver is still notably
> faster because it is able to report completion of the
> I/O request before any encryption is done, while the
> in-QEMU driver must encrypt the data before completion.
> 
> Signed-off-by: Daniel P. Berrange 
> ---
>  block/crypto.c | 28 +++-
>  1 file changed, 15 insertions(+), 13 deletions(-)

Reviewed-by: Max Reitz 



signature.asc
Description: OpenPGP digital signature


[Qemu-block] [PATCH v2 4/5] pci: Add INTERFACE_CONVENTIONAL_PCI_DEVICE to Conventional PCI devices

2017-09-27 Thread Eduardo Habkost
Add INTERFACE_CONVENTIONAL_PCI_DEVICE to all direct subtypes of
TYPE_PCI_DEVICE, except:

1) The ones that already have INTERFACE_PCIE_DEVICE set:

* base-xhci
* e1000e
* nvme
* pvscsi
* vfio-pci
* virtio-pci
* vmxnet3

2) base-pci-bridge

Not all PCI bridges are Conventional PCI devices, so
INTERFACE_CONVENTIONAL_PCI_DEVICE is added only to the subtypes
that are actually Conventional PCI:

* dec-21154-p2p-bridge
* i82801b11-bridge
* pbm-bridge
* pci-bridge

The direct subtypes of base-pci-bridge not touched by this patch
are:

* xilinx-pcie-root: Already marked as PCIe-only.
* pcie-pci-bridge: Already marked as PCIe-only.
* pcie-port: all non-abstract subtypes of pcie-port are already
  marked as PCIe-only devices.

3) megasas-base

Not all megasas devices are Conventional PCI devices, so the
interface names are added to the subclasses registered by
megasas_register_types(), according to information in the
megasas_devices[] array.

"megasas-gen2" already implements INTERFACE_PCIE_DEVICE, so add
INTERFACE_CONVENTIONAL_PCI_DEVICE only to "megasas".

Acked-by: Alberto Garcia 
Acked-by: John Snow 
Acked-by: Anthony PERARD 
Signed-off-by: Eduardo Habkost 
---
Changes v1 -> v2:
* s/legacy/conventional/
  * Suggested-by: Alex Williamson 
* Note about pcie-pci-bridge on commit message.
* New devices: sungem, sunhme

Cc: "Michael S. Tsirkin" 
Cc: Igor Mammedov 
Cc: Gerd Hoffmann 
Cc: Paolo Bonzini 
Cc: Richard Henderson 
Cc: Eduardo Habkost 
Cc: Stefano Stabellini 
Cc: Anthony Perard 
Cc: John Snow 
Cc: Alberto Garcia 
Cc: Aurelien Jarno 
Cc: Yongbok Kim 
Cc: Jiri Slaby 
Cc: Alexander Graf 
Cc: Marcel Apfelbaum 
Cc: Jason Wang 
Cc: Jiri Pirko 
Cc: "Hervé Poussineau" 
Cc: Peter Maydell 
Cc: David Gibson 
Cc: Hannes Reinecke 
Cc: Mark Cave-Ayland 
Cc: Artyom Tarasenko 
Cc: Alex Williamson 
Cc: qemu-de...@nongnu.org
Cc: xen-de...@lists.xenproject.org
Cc: qemu-block@nongnu.org
Cc: qemu-...@nongnu.org
Cc: qemu-...@nongnu.org
---
 hw/acpi/piix4.c |  1 +
 hw/audio/ac97.c |  4 
 hw/audio/es1370.c   |  4 
 hw/audio/intel-hda.c|  4 
 hw/char/serial-pci.c| 12 
 hw/display/cirrus_vga.c |  4 
 hw/display/qxl.c|  4 
 hw/display/sm501.c  |  4 
 hw/display/vga-pci.c|  4 
 hw/display/vmware_vga.c |  4 
 hw/i2c/smbus_ich9.c |  4 
 hw/i386/amd_iommu.c |  4 
 hw/i386/kvm/pci-assign.c|  4 
 hw/i386/pc_piix.c   |  4 
 hw/i386/xen/xen_platform.c  |  4 
 hw/i386/xen/xen_pvdevice.c  |  4 
 hw/ide/ich.c|  4 
 hw/ide/pci.c|  4 
 hw/ipack/tpci200.c  |  4 
 hw/isa/i82378.c |  4 
 hw/isa/lpc_ich9.c   |  1 +
 hw/isa/piix4.c  |  4 
 hw/isa/vt82c686.c   | 16 
 hw/mips/gt64xxx_pci.c   |  4 
 hw/misc/edu.c   |  5 +
 hw/misc/ivshmem.c   |  4 
 hw/misc/macio/macio.c   |  4 
 hw/misc/pci-testdev.c   |  4 
 hw/net/e1000.c  |  4 
 hw/net/eepro100.c   |  4 
 hw/net/ne2000.c |  4 
 hw/net/pcnet-pci.c  |  4 
 hw/net/rocker/rocker.c  |  4 
 hw/net/rtl8139.c|  4 
 hw/net/sungem.c |  4 
 hw/net/sunhme.c |  4 
 hw/pci-bridge/dec.c |  8 
 hw/pci-bridge/i82801b11.c   |  4 
 hw/pci-bridge/pci_bridge_dev.c  |  1 +
 hw/pci-bridge/pci_expander_bridge.c |  8 
 hw/pci-host/apb.c   |  8 
 hw/pci-host/bonito.c|  4 
 hw/pci-host/gpex.c  |  4 
 hw/pci-host/grackle.c   |  4 
 hw/pci-host/piix.c  |  8 
 hw/pci-host/ppce500.c   |  4 
 hw/pci-host/prep.c  |  4 
 hw/pci-host/q35.c   |  4 
 hw/pci-host/uninorth.c  | 16 
 hw/pci-host/versatile.c |  4 
 hw/ppc/ppc4xx_pci.c |  4 
 hw/scsi/esp-pci.c   |  4 
 hw/scsi/lsi53c895a.c|  4 
 hw/scsi/megasas.c   |  4 
 hw/scsi/mptsas.c|  4 
 hw/sd/sdhci.c   |  4 
 hw/sh4/sh_pci.c |  4 
 hw/sparc64/sun4u.c  |  4 
 hw/usb/hcd-ehci-pci.c   |  4 
 hw/usb/hcd-ohci.c   |  4 
 hw/usb/hcd-uhci.c   |  4 
 hw/vfio/pci-quirks.c|  4 
 hw/watchdog/wdt_i6300esb.c  |  4 
 hw/xen/xen_pt.c |  4 
 

[Qemu-block] [PATCH v2 3/5] pci: Add INTERFACE_PCIE_DEVICE to all PCIe devices

2017-09-27 Thread Eduardo Habkost
Change all devices that set is_express=1 to implement
INTERFACE_PCIE_DEVICE.

Cc: Keith Busch 
Cc: Kevin Wolf 
Cc: Max Reitz 
Cc: Dmitry Fleytman 
Cc: Jason Wang 
Cc: "Michael S. Tsirkin" 
Cc: Marcel Apfelbaum 
Cc: Paul Burton 
Cc: Paolo Bonzini 
Cc: Hannes Reinecke 
Cc: qemu-block@nongnu.org
Reviewed-by: Alistair Francis 
Signed-off-by: Eduardo Habkost 
---
Changes v1 -> v2:
* base-xhci is marked as hybrid, now (in another patch)
* Included pcie-pci-bridge
---
 hw/block/nvme.c| 4 
 hw/net/e1000e.c| 4 
 hw/pci-bridge/pcie_pci_bridge.c| 1 +
 hw/pci-bridge/pcie_root_port.c | 4 
 hw/pci-bridge/xio3130_downstream.c | 4 
 hw/pci-bridge/xio3130_upstream.c   | 4 
 hw/pci-host/xilinx-pcie.c  | 4 
 hw/scsi/megasas.c  | 6 ++
 8 files changed, 31 insertions(+)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 9aa32692a3..441e21ed1f 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -1110,6 +1110,10 @@ static const TypeInfo nvme_info = {
 .instance_size = sizeof(NvmeCtrl),
 .class_init= nvme_class_init,
 .instance_init = nvme_instance_init,
+.interfaces = (InterfaceInfo[]) {
+{ INTERFACE_PCIE_DEVICE },
+{ }
+},
 };
 
 static void nvme_register_types(void)
diff --git a/hw/net/e1000e.c b/hw/net/e1000e.c
index 6c42b4478c..81f7934a59 100644
--- a/hw/net/e1000e.c
+++ b/hw/net/e1000e.c
@@ -708,6 +708,10 @@ static const TypeInfo e1000e_info = {
 .instance_size = sizeof(E1000EState),
 .class_init = e1000e_class_init,
 .instance_init = e1000e_instance_init,
+.interfaces = (InterfaceInfo[]) {
+{ INTERFACE_PCIE_DEVICE },
+{ }
+},
 };
 
 static void e1000e_register_types(void)
diff --git a/hw/pci-bridge/pcie_pci_bridge.c b/hw/pci-bridge/pcie_pci_bridge.c
index 9aa5cc3e45..88db143633 100644
--- a/hw/pci-bridge/pcie_pci_bridge.c
+++ b/hw/pci-bridge/pcie_pci_bridge.c
@@ -180,6 +180,7 @@ static const TypeInfo pcie_pci_bridge_info = {
 .class_init = pcie_pci_bridge_class_init,
 .interfaces = (InterfaceInfo[]) {
 { TYPE_HOTPLUG_HANDLER },
+{ INTERFACE_PCIE_DEVICE },
 { },
 }
 };
diff --git a/hw/pci-bridge/pcie_root_port.c b/hw/pci-bridge/pcie_root_port.c
index 4d588cb22e..9b6e4ce512 100644
--- a/hw/pci-bridge/pcie_root_port.c
+++ b/hw/pci-bridge/pcie_root_port.c
@@ -161,6 +161,10 @@ static const TypeInfo rp_info = {
 .class_init= rp_class_init,
 .abstract  = true,
 .class_size = sizeof(PCIERootPortClass),
+.interfaces = (InterfaceInfo[]) {
+{ INTERFACE_PCIE_DEVICE },
+{ }
+},
 };
 
 static void rp_register_types(void)
diff --git a/hw/pci-bridge/xio3130_downstream.c 
b/hw/pci-bridge/xio3130_downstream.c
index e706f36cb7..7d2f7629c1 100644
--- a/hw/pci-bridge/xio3130_downstream.c
+++ b/hw/pci-bridge/xio3130_downstream.c
@@ -195,6 +195,10 @@ static const TypeInfo xio3130_downstream_info = {
 .name  = "xio3130-downstream",
 .parent= TYPE_PCIE_SLOT,
 .class_init= xio3130_downstream_class_init,
+.interfaces = (InterfaceInfo[]) {
+{ INTERFACE_PCIE_DEVICE },
+{ }
+},
 };
 
 static void xio3130_downstream_register_types(void)
diff --git a/hw/pci-bridge/xio3130_upstream.c b/hw/pci-bridge/xio3130_upstream.c
index a052224bbf..227997ce46 100644
--- a/hw/pci-bridge/xio3130_upstream.c
+++ b/hw/pci-bridge/xio3130_upstream.c
@@ -166,6 +166,10 @@ static const TypeInfo xio3130_upstream_info = {
 .name  = "x3130-upstream",
 .parent= TYPE_PCIE_PORT,
 .class_init= xio3130_upstream_class_init,
+.interfaces = (InterfaceInfo[]) {
+{ INTERFACE_PCIE_DEVICE },
+{ }
+},
 };
 
 static void xio3130_upstream_register_types(void)
diff --git a/hw/pci-host/xilinx-pcie.c b/hw/pci-host/xilinx-pcie.c
index 4613dda1d2..7659253090 100644
--- a/hw/pci-host/xilinx-pcie.c
+++ b/hw/pci-host/xilinx-pcie.c
@@ -317,6 +317,10 @@ static const TypeInfo xilinx_pcie_root_info = {
 .parent = TYPE_PCI_BRIDGE,
 .instance_size = sizeof(XilinxPCIERoot),
 .class_init = xilinx_pcie_root_class_init,
+.interfaces = (InterfaceInfo[]) {
+{ INTERFACE_PCIE_DEVICE },
+{ }
+},
 };
 
 static void xilinx_pcie_register(void)
diff --git a/hw/scsi/megasas.c b/hw/scsi/megasas.c
index 0db68aacee..535ee267c3 100644
--- a/hw/scsi/megasas.c
+++ b/hw/scsi/megasas.c
@@ -2451,6 +2451,7 @@ typedef struct MegasasInfo {
 int osts;
 const VMStateDescription *vmsd;
 Property *props;
+InterfaceInfo *interfaces;
 } MegasasInfo;
 
 static struct MegasasInfo megasas_devices[] = {
@@ -2480,6 +2481,10 @@ static struct MegasasInfo megasas_devices[] = {
 .is_express = true,
 .vmsd = &vmstate_megasas_gen2,
 .props = megasas_properties_gen2,
+.interfaces = (InterfaceInfo[]) {
+{ INTERFACE_PCIE_DEVICE },
+{ }
+},
 }
 };

Re: [Qemu-block] [Qemu-devel] [PATCH v2] virtio: introduce `info virtio' hmp command

2017-09-27 Thread Michael S. Tsirkin
On Wed, Sep 27, 2017 at 06:09:42PM +0300, Jan Dakinevich wrote:
> The command is intended for exposing device specific virtio feature bits
> and their negotiation status. It is convenient and useful for debug
> purpose.
> 
> Names of features are taken from a devices via get_feature_name() within
> VirtioDeviceClass. If certain device doesn't implement it, the command
> will print only hexadecimal value of feature mask.
> 
> Cc: Denis V. Lunev 
> Signed-off-by: Jan Dakinevich 

OK but if it's useful as an hmp command, why not as a qmp command?

> ---
> v2:
> * Moved hmp_info_virtio and stuff to hmp.c to avoid build error
> * Added printing of device status
> * Listed common virtio features
> 
> v1: http://lists.nongnu.org/archive/html/qemu-devel/2017-09/msg07247.html
> ---
>  hmp-commands-info.hx|  14 +
>  hmp.c   | 148 
> 
>  hmp.h   |   1 +
>  hw/block/virtio-blk.c   |  20 ++
>  hw/char/virtio-serial-bus.c |  15 +
>  hw/display/virtio-gpu.c |  12 
>  hw/net/virtio-net.c |  34 ++
>  hw/scsi/virtio-scsi.c   |  15 +
>  hw/virtio/virtio-balloon.c  |  15 +
>  hw/virtio/virtio-bus.c  |   1 +
>  include/hw/virtio/virtio.h  |   2 +
>  monitor.c   |   1 +
>  12 files changed, 278 insertions(+)
> 
> diff --git a/hmp-commands-info.hx b/hmp-commands-info.hx
> index 4f1ece9..2550027 100644
> --- a/hmp-commands-info.hx
> +++ b/hmp-commands-info.hx
> @@ -868,6 +868,20 @@ ETEXI
>  },
>  
>  STEXI
> +@item info virtio
> +@findex virtio
> +Display guest and host fetures for all virtio devices.
> +ETEXI
> +
> +{
> +.name   = "virtio",
> +.args_type  = "",
> +.params = "",
> +.help   = "show virtio info",
> +.cmd= hmp_info_virtio,
> +},
> +
> +STEXI
>  @end table
>  ETEXI
>  
> diff --git a/hmp.c b/hmp.c
> index ace729d..009d808 100644
> --- a/hmp.c
> +++ b/hmp.c
> @@ -43,6 +43,7 @@
>  #include "hw/intc/intc.h"
>  #include "migration/snapshot.h"
>  #include "migration/misc.h"
> +#include "hw/virtio/virtio.h"
>  
>  #ifdef CONFIG_SPICE
>  #include 
> @@ -2894,3 +2895,150 @@ void hmp_info_memory_size_summary(Monitor *mon, const 
> QDict *qdict)
>  }
>  hmp_handle_error(mon, &err);
>  }
> +
> +#define HMP_INFO_VIRTIO_INDENT 2
> +#define HMP_INFO_VIRTIO_FIELD 32
> +
> +static void hmp_info_virtio_print_status(Monitor *mon, VirtIODevice *vdev)
> +{
> +uint8_t status[] = {
> +VIRTIO_CONFIG_S_ACKNOWLEDGE,
> +VIRTIO_CONFIG_S_DRIVER,
> +VIRTIO_CONFIG_S_DRIVER_OK,
> +VIRTIO_CONFIG_S_FEATURES_OK,
> +VIRTIO_CONFIG_S_NEEDS_RESET,
> +VIRTIO_CONFIG_S_FAILED,
> +};
> +const char *names[] = {
> +"acknowledge",
> +"driver",
> +"driver_ok",
> +"features_ok",
> +"needs_reset"
> +"failed",
> +};
> +const char *comma = "";
> +int idx;
> +
> +monitor_printf(mon, "%*sstatus:   0x%02"PRIx8" ",
> +   HMP_INFO_VIRTIO_INDENT, "", vdev->status);
> +
> +for (idx = 0; idx < ARRAY_SIZE(status); idx++) {
> +if (!(vdev->status & status[idx])) {
> +continue;
> +}
> +monitor_printf(mon, "%s%s", comma, names[idx]);
> +if (names[idx]) {
> +comma = ",";
> +}
> +}
> +monitor_printf(mon, "\n");
> +}
> +
> +static void hmp_info_virtio_print_common_features(Monitor *mon,
> +  VirtIODevice *vdev)
> +{
> +uint64_t features[] = {
> +VIRTIO_RING_F_INDIRECT_DESC,
> +VIRTIO_RING_F_EVENT_IDX,
> +VIRTIO_F_NOTIFY_ON_EMPTY,
> +VIRTIO_F_ANY_LAYOUT,
> +VIRTIO_F_IOMMU_PLATFORM,
> +VIRTIO_F_VERSION_1,
> +};
> +const char *names[] = {
> +"indirect_desc",
> +"event_idx",
> +"notify_on_empty",
> +"any_layout",
> +"iommu_platform",
> +"version_1",
> +};
> +int idx;
> +
> +monitor_printf(mon, "%*scommon features:\n", HMP_INFO_VIRTIO_INDENT, "");
> +
> +for (idx = 0; idx < ARRAY_SIZE(features); idx++) {
> +const char *ack = vdev->guest_features & features[idx] ? "acked" : 
> "";
> +
> +if (!(vdev->host_features & features[idx])) {
> +continue;
> +}
> +monitor_printf(mon, "%*s%*s%*s\n", HMP_INFO_VIRTIO_INDENT, "",
> +   HMP_INFO_VIRTIO_FIELD, names[idx],
> +   HMP_INFO_VIRTIO_FIELD, ack);
> +}
> +}
> +
> +static void hmp_info_virtio_print_specific_features(Monitor *mon,
> +VirtIODevice *vdev)
> +{
> +VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev);
> +int idx;
> +
> +if (!vdc->get_feature_name) {
> +return;
> +}
> +
> +monitor_printf(mon, "%*sspecific features:\n", HMP_INFO_VIRTIO_INDENT

Re: [Qemu-block] [Qemu-devel] [PATCH v4 12/23] block: Convert bdrv_get_block_status_above() to bytes

2017-09-27 Thread John Snow


On 09/27/2017 02:57 PM, Eric Blake wrote:
> On 09/27/2017 01:41 PM, John Snow wrote:
>>
>>
>> On 09/13/2017 12:03 PM, Eric Blake wrote:
>>> We are gradually moving away from sector-based interfaces, towards
>>> byte-based.  In the common case, allocation is unlikely to ever use
>>> values that are not naturally sector-aligned, but it is possible
>>> that byte-based values will let us be more precise about allocation
>>> at the end of an unaligned file that can do byte-based access.
>>>
>>> Changing the name of the function from bdrv_get_block_status_above()
>>> to bdrv_block_status_above() ensures that the compiler enforces that
>>> all callers are updated.  For now, the io.c layer still assert()s
>>> that all callers are sector-aligned, but that can be relaxed when a
>>> later patch implements byte-based block status in the drivers.
>>>
>>> For the most part this patch is just the addition of scaling at the
>>> callers followed by inverse scaling at bdrv_block_status().  But some
>>> code, particularly bdrv_block_status(), gets a lot simpler because
>>> it no longer has to mess with sectors.  Likewise, mirror code no
>>> longer computes s->granularity >> BDRV_SECTOR_BITS, and can therefore
>>> drop an assertion (fix a neighboring assertion to use is_power_of_2
>>> while there).
>>>
>>
>> Huh, I suppose so, yeah. Do you have a test that covers what happens in
>> this newly available use case?
> 
> Not directly - the mirror code no longer requires sector alignment, but
> is still unlikely to use sub-sector requests unless a particular driver
> returns really small status information.  I suppose we could tweak the
> blkdebug driver to force status requests to be fragmented at
> ridiculously small alignments, and then prove that mirroring still
> occurs correctly, once all the series are in, but it's probably more
> effort than it is worth to force sub-sector mirroring if we don't have a
> real use case that will rely on it.
> 

Hmm, yeah, the code probably can't be exercised currently but I do
wonder if we're removing too many breadcrumbs for potential problem
spots if someone decides to return sub-sector information in the future.

Well, I suppose I haven't been too diligent about complaining about
their removal elsewhere, so for consistency:

Either with or without the assertion removed as you see fit:

Reviewed-by: John Snow 

>>
>>> For ease of review, bdrv_get_block_status() was tackled separately.
>>>
>>> Signed-off-by: Eric Blake 
>>>
>>
>> Looks mechanically correct, anyway.
>>
>> Reviewed-by: John Snow 
>>
> 



Re: [Qemu-block] [Qemu-devel] [PATCH v4 13/23] qemu-img: Simplify logic in img_compare()

2017-09-27 Thread Eric Blake
On 09/27/2017 02:05 PM, John Snow wrote:
> 
> 
> On 09/13/2017 12:03 PM, Eric Blake wrote:
>> As long as we are querying the status for a chunk smaller than
>> the known image size, we are guaranteed that a successful return
>> will have set pnum to a non-zero size (pnum is zero only for
>> queries beyond the end of the file).  Use that to slightly
>> simplify the calculation of the current chunk size being compared.
>> Likewise, we don't have to shrink the amount of data operated on
>> until we know we have to read the file, and therefore have to fit
>> in the bounds of our buffer.  Also, note that 'total_sectors_over'
>> is equivalent to 'progress_base'.
>>
>> With these changes in place, sectors_to_process() is now dead code,
>> and can be removed.
>>
>> Signed-off-by: Eric Blake 
>>

>> @@ -1402,14 +1393,9 @@ static int img_compare(int argc, char **argv)
>>  goto out;
>>  }
>>  allocated2 = status2 & BDRV_BLOCK_ALLOCATED;
>> -if (pnum1) {
>> -nb_sectors = MIN(nb_sectors,
>> - DIV_ROUND_UP(pnum1, BDRV_SECTOR_SIZE));
>> -}
>> -if (pnum2) {
>> -nb_sectors = MIN(nb_sectors,
>> - DIV_ROUND_UP(pnum2, BDRV_SECTOR_SIZE));
>> -}
>> +
>> +assert(pnum1 && pnum2);
>> +nb_sectors = DIV_ROUND_UP(MIN(pnum1, pnum2), BDRV_SECTOR_SIZE);
> 
> In the apocalyptic future where non-sector sized returns are possible,
> does this math make sense?
> 
> e.g. say the return is zeroes, but it's not aligned anymore, so we
> assume we have an extra half a sector's worth of zeroes here.

Not introduced in this patch, but a good question for 12/23.  We want to
round up rather than down to ensure that we don't inf-loop on a partial
sector response; but at the same time, you're right that if we got a
report of a half-sector zero and we widen it, we can't guarantee that
the second half is zero.

On the bright side, this rounding goes away when later patches switch
img_compare to be byte-based, later in this series.  But you're right
that it is probably smarter to have 12/23 assert that things are already
aligned (and thus we don't need to round in the first place).

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [Qemu-devel] [PATCH v4 13/23] qemu-img: Simplify logic in img_compare()

2017-09-27 Thread John Snow


On 09/13/2017 12:03 PM, Eric Blake wrote:
> As long as we are querying the status for a chunk smaller than
> the known image size, we are guaranteed that a successful return
> will have set pnum to a non-zero size (pnum is zero only for
> queries beyond the end of the file).  Use that to slightly
> simplify the calculation of the current chunk size being compared.
> Likewise, we don't have to shrink the amount of data operated on
> until we know we have to read the file, and therefore have to fit
> in the bounds of our buffer.  Also, note that 'total_sectors_over'
> is equivalent to 'progress_base'.
> 
> With these changes in place, sectors_to_process() is now dead code,
> and can be removed.
> 
> Signed-off-by: Eric Blake 
> 
> ---
> v3: new patch
> ---
>  qemu-img.c | 40 +++-
>  1 file changed, 11 insertions(+), 29 deletions(-)
> 
> diff --git a/qemu-img.c b/qemu-img.c
> index b91133b922..f8423e9b3f 100644
> --- a/qemu-img.c
> +++ b/qemu-img.c
> @@ -1171,11 +1171,6 @@ static int64_t sectors_to_bytes(int64_t sectors)
>  return sectors << BDRV_SECTOR_BITS;
>  }
> 
> -static int64_t sectors_to_process(int64_t total, int64_t from)
> -{
> -return MIN(total - from, IO_BUF_SIZE >> BDRV_SECTOR_BITS);
> -}
> -
>  /*
>   * Check if passed sectors are empty (not allocated or contain only 0 bytes)
>   *
> @@ -1372,13 +1367,9 @@ static int img_compare(int argc, char **argv)
>  goto out;
>  }
> 
> -for (;;) {
> +while (sector_num < total_sectors) {
>  int64_t status1, status2;
> 
> -nb_sectors = sectors_to_process(total_sectors, sector_num);
> -if (nb_sectors <= 0) {
> -break;
> -}
>  status1 = bdrv_block_status_above(bs1, NULL,
>sector_num * BDRV_SECTOR_SIZE,
>(total_sectors1 - sector_num) *
> @@ -1402,14 +1393,9 @@ static int img_compare(int argc, char **argv)
>  goto out;
>  }
>  allocated2 = status2 & BDRV_BLOCK_ALLOCATED;
> -if (pnum1) {
> -nb_sectors = MIN(nb_sectors,
> - DIV_ROUND_UP(pnum1, BDRV_SECTOR_SIZE));
> -}
> -if (pnum2) {
> -nb_sectors = MIN(nb_sectors,
> - DIV_ROUND_UP(pnum2, BDRV_SECTOR_SIZE));
> -}
> +
> +assert(pnum1 && pnum2);
> +nb_sectors = DIV_ROUND_UP(MIN(pnum1, pnum2), BDRV_SECTOR_SIZE);

In the apocalyptic future where non-sector sized returns are possible,
does this math make sense?

e.g. say the return is zeroes, but it's not aligned anymore, so we
assume we have an extra half a sector's worth of zeroes here.

> 
>  if (strict) {
>  if ((status1 & ~BDRV_BLOCK_OFFSET_MASK) !=
> @@ -1422,9 +1408,10 @@ static int img_compare(int argc, char **argv)
>  }
>  }
>  if ((status1 & BDRV_BLOCK_ZERO) && (status2 & BDRV_BLOCK_ZERO)) {
> -nb_sectors = DIV_ROUND_UP(MIN(pnum1, pnum2), BDRV_SECTOR_SIZE);
> +/* nothing to do */
>  } else if (allocated1 == allocated2) {
>  if (allocated1) {
> +nb_sectors = MIN(nb_sectors, IO_BUF_SIZE >> 
> BDRV_SECTOR_BITS);
>  ret = blk_pread(blk1, sector_num << BDRV_SECTOR_BITS, buf1,
>  nb_sectors << BDRV_SECTOR_BITS);
>  if (ret < 0) {
> @@ -1453,7 +1440,7 @@ static int img_compare(int argc, char **argv)
>  }
>  }
>  } else {
> -
> +nb_sectors = MIN(nb_sectors, IO_BUF_SIZE >> BDRV_SECTOR_BITS);
>  if (allocated1) {
>  ret = check_empty_sectors(blk1, sector_num, nb_sectors,
>filename1, buf1, quiet);
> @@ -1476,30 +1463,24 @@ static int img_compare(int argc, char **argv)
> 
>  if (total_sectors1 != total_sectors2) {
>  BlockBackend *blk_over;
> -int64_t total_sectors_over;
>  const char *filename_over;
> 
>  qprintf(quiet, "Warning: Image size mismatch!\n");
>  if (total_sectors1 > total_sectors2) {
> -total_sectors_over = total_sectors1;
>  blk_over = blk1;
>  filename_over = filename1;
>  } else {
> -total_sectors_over = total_sectors2;
>  blk_over = blk2;
>  filename_over = filename2;
>  }
> 
> -for (;;) {
> +while (sector_num < progress_base) {
>  int64_t count;
> 
> -nb_sectors = sectors_to_process(total_sectors_over, sector_num);
> -if (nb_sectors <= 0) {
> -break;
> -}
>  ret = bdrv_is_allocated_above(blk_bs(blk_over), NULL,
>sector_num * BDRV_SECTOR_SIZE,
> -  nb_sectors * BDRV_SECTOR_SIZE,
> +

Re: [Qemu-block] [Qemu-devel] [PATCH v4 12/23] block: Convert bdrv_get_block_status_above() to bytes

2017-09-27 Thread Eric Blake
On 09/27/2017 01:41 PM, John Snow wrote:
> 
> 
> On 09/13/2017 12:03 PM, Eric Blake wrote:
>> We are gradually moving away from sector-based interfaces, towards
>> byte-based.  In the common case, allocation is unlikely to ever use
>> values that are not naturally sector-aligned, but it is possible
>> that byte-based values will let us be more precise about allocation
>> at the end of an unaligned file that can do byte-based access.
>>
>> Changing the name of the function from bdrv_get_block_status_above()
>> to bdrv_block_status_above() ensures that the compiler enforces that
>> all callers are updated.  For now, the io.c layer still assert()s
>> that all callers are sector-aligned, but that can be relaxed when a
>> later patch implements byte-based block status in the drivers.
>>
>> For the most part this patch is just the addition of scaling at the
>> callers followed by inverse scaling at bdrv_block_status().  But some
>> code, particularly bdrv_block_status(), gets a lot simpler because
>> it no longer has to mess with sectors.  Likewise, mirror code no
>> longer computes s->granularity >> BDRV_SECTOR_BITS, and can therefore
>> drop an assertion (fix a neighboring assertion to use is_power_of_2
>> while there).
>>
> 
> Huh, I suppose so, yeah. Do you have a test that covers what happens in
> this newly available use case?

Not directly - the mirror code no longer requires sector alignment, but
is still unlikely to use sub-sector requests unless a particular driver
returns really small status information.  I suppose we could tweak the
blkdebug driver to force status requests to be fragmented at
ridiculously small alignments, and then prove that mirroring still
occurs correctly, once all the series are in, but it's probably more
effort than it is worth to force sub-sector mirroring if we don't have a
real use case that will rely on it.

> 
>> For ease of review, bdrv_get_block_status() was tackled separately.
>>
>> Signed-off-by: Eric Blake 
>>
> 
> Looks mechanically correct, anyway.
> 
> Reviewed-by: John Snow 
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [Qemu-devel] [PATCH v4 12/23] block: Convert bdrv_get_block_status_above() to bytes

2017-09-27 Thread John Snow


On 09/13/2017 12:03 PM, Eric Blake wrote:
> We are gradually moving away from sector-based interfaces, towards
> byte-based.  In the common case, allocation is unlikely to ever use
> values that are not naturally sector-aligned, but it is possible
> that byte-based values will let us be more precise about allocation
> at the end of an unaligned file that can do byte-based access.
> 
> Changing the name of the function from bdrv_get_block_status_above()
> to bdrv_block_status_above() ensures that the compiler enforces that
> all callers are updated.  For now, the io.c layer still assert()s
> that all callers are sector-aligned, but that can be relaxed when a
> later patch implements byte-based block status in the drivers.
> 
> For the most part this patch is just the addition of scaling at the
> callers followed by inverse scaling at bdrv_block_status().  But some
> code, particularly bdrv_block_status(), gets a lot simpler because
> it no longer has to mess with sectors.  Likewise, mirror code no
> longer computes s->granularity >> BDRV_SECTOR_BITS, and can therefore
> drop an assertion (fix a neighboring assertion to use is_power_of_2
> while there).
> 

Huh, I suppose so, yeah. Do you have a test that covers what happens in
this newly available use case?

> For ease of review, bdrv_get_block_status() was tackled separately.
> 
> Signed-off-by: Eric Blake 
> 

Looks mechanically correct, anyway.

Reviewed-by: John Snow 



Re: [Qemu-block] [Qemu-devel] [PATCH v4 11/23] block: Switch bdrv_co_get_block_status_above() to byte-based

2017-09-27 Thread John Snow


On 09/13/2017 12:03 PM, Eric Blake wrote:
> We are gradually converting to byte-based interfaces, as they are
> easier to reason about than sector-based.  Convert another internal
> type (no semantic change), and rename it to match the corresponding
> public function rename.
> 
> Signed-off-by: Eric Blake 
> Reviewed-by: Fam Zheng 


Reviewed-by: John Snow 



Re: [Qemu-block] [Qemu-devel] [PATCH v4 10/23] block: Switch bdrv_common_block_status_above() to byte-based

2017-09-27 Thread John Snow


On 09/13/2017 12:03 PM, Eric Blake wrote:
> We are gradually converting to byte-based interfaces, as they are
> easier to reason about than sector-based.  Convert another internal
> function (no semantic change).
> 
> Signed-off-by: Eric Blake 
> Reviewed-by: Fam Zheng 

Reviewed-by: John Snow 



[Qemu-block] [Qemu-devel] [PATCH v3] virtio: introduce `info virtio' hmp command

2017-09-27 Thread Jan Dakinevich
The command is intended for exposing device specific virtio feature bits
and their negotiation status. It is convenient and useful for debug
purpose.

Names of features are taken from a devices via get_feature_name() within
VirtioDeviceClass. If certain device doesn't implement it, the command
will print only hexadecimal value of feature mask.

Cc: Denis V. Lunev 
Signed-off-by: Jan Dakinevich 
---

v3:
* Avoid signed int
* Use virtio_vdev_has_feature()/virtio_host_has_feature()

v2: http://lists.nongnu.org/archive/html/qemu-devel/2017-09/msg07527.html
v1: http://lists.nongnu.org/archive/html/qemu-devel/2017-09/msg07247.html
---
 hmp-commands-info.hx|  14 +
 hmp.c   | 149 
 hmp.h   |   1 +
 hw/block/virtio-blk.c   |  21 +++
 hw/char/virtio-serial-bus.c |  15 +
 hw/display/virtio-gpu.c |  13 
 hw/net/virtio-net.c |  35 +++
 hw/scsi/virtio-scsi.c   |  16 +
 hw/virtio/virtio-balloon.c  |  15 +
 hw/virtio/virtio-bus.c  |   1 +
 include/hw/virtio/virtio.h  |   2 +
 monitor.c   |   1 +
 12 files changed, 283 insertions(+)

diff --git a/hmp-commands-info.hx b/hmp-commands-info.hx
index 4f1ece9..2550027 100644
--- a/hmp-commands-info.hx
+++ b/hmp-commands-info.hx
@@ -868,6 +868,20 @@ ETEXI
 },
 
 STEXI
+@item info virtio
+@findex virtio
+Display guest and host fetures for all virtio devices.
+ETEXI
+
+{
+.name   = "virtio",
+.args_type  = "",
+.params = "",
+.help   = "show virtio info",
+.cmd= hmp_info_virtio,
+},
+
+STEXI
 @end table
 ETEXI
 
diff --git a/hmp.c b/hmp.c
index ace729d..f231395 100644
--- a/hmp.c
+++ b/hmp.c
@@ -43,6 +43,7 @@
 #include "hw/intc/intc.h"
 #include "migration/snapshot.h"
 #include "migration/misc.h"
+#include "hw/virtio/virtio.h"
 
 #ifdef CONFIG_SPICE
 #include 
@@ -2894,3 +2895,151 @@ void hmp_info_memory_size_summary(Monitor *mon, const 
QDict *qdict)
 }
 hmp_handle_error(mon, &err);
 }
+
+#define HMP_INFO_VIRTIO_INDENT 2
+#define HMP_INFO_VIRTIO_FIELD 32
+
+static void hmp_info_virtio_print_status(Monitor *mon, VirtIODevice *vdev)
+{
+uint8_t status[] = {
+VIRTIO_CONFIG_S_ACKNOWLEDGE,
+VIRTIO_CONFIG_S_DRIVER,
+VIRTIO_CONFIG_S_DRIVER_OK,
+VIRTIO_CONFIG_S_FEATURES_OK,
+VIRTIO_CONFIG_S_NEEDS_RESET,
+VIRTIO_CONFIG_S_FAILED,
+};
+const char *names[] = {
+"acknowledge",
+"driver",
+"driver_ok",
+"features_ok",
+"needs_reset"
+"failed",
+};
+const char *comma = "";
+unsigned idx;
+
+monitor_printf(mon, "%*sstatus:   0x%02"PRIx8" ",
+   HMP_INFO_VIRTIO_INDENT, "", vdev->status);
+
+for (idx = 0; idx < ARRAY_SIZE(status); idx++) {
+if (!(vdev->status & status[idx])) {
+continue;
+}
+monitor_printf(mon, "%s%s", comma, names[idx]);
+if (names[idx]) {
+comma = ",";
+}
+}
+monitor_printf(mon, "\n");
+}
+
+static void hmp_info_virtio_print_common_features(Monitor *mon,
+  VirtIODevice *vdev)
+{
+uint64_t features[] = {
+VIRTIO_RING_F_INDIRECT_DESC,
+VIRTIO_RING_F_EVENT_IDX,
+VIRTIO_F_NOTIFY_ON_EMPTY,
+VIRTIO_F_ANY_LAYOUT,
+VIRTIO_F_IOMMU_PLATFORM,
+VIRTIO_F_VERSION_1,
+};
+const char *names[] = {
+"indirect_desc",
+"event_idx",
+"notify_on_empty",
+"any_layout",
+"iommu_platform",
+"version_1",
+};
+unsigned idx;
+
+monitor_printf(mon, "%*scommon features:\n", HMP_INFO_VIRTIO_INDENT, "");
+
+for (idx = 0; idx < ARRAY_SIZE(features); idx++) {
+const char *ack = virtio_vdev_has_feature(vdev, features[idx])
+? "acked" : "";
+
+if (!virtio_host_has_feature(vdev, features[idx])) {
+continue;
+}
+monitor_printf(mon, "%*s%*s%*s\n", HMP_INFO_VIRTIO_INDENT, "",
+   HMP_INFO_VIRTIO_FIELD, names[idx],
+   HMP_INFO_VIRTIO_FIELD, ack);
+}
+}
+
+static void hmp_info_virtio_print_specific_features(Monitor *mon,
+VirtIODevice *vdev)
+{
+VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev);
+unsigned idx;
+
+if (!vdc->get_feature_name) {
+return;
+}
+
+monitor_printf(mon, "%*sspecific features:\n", HMP_INFO_VIRTIO_INDENT, "");
+
+for (idx = 0; idx < 64; idx++) {
+const char *name = vdc->get_feature_name(vdev, idx);
+const char *ack = virtio_vdev_has_feature(vdev, idx) ? "acked" : "";
+
+if (!name) {
+continue;
+}
+if (!virtio_host_has_feature(vdev, idx)) {
+continue;
+}
+monitor_printf(mon, "%*s%*s%*s\n", HMP_INFO_VIRT

Re: [Qemu-block] [Qemu-devel] [PATCH v2] virtio: introduce `info virtio' hmp command

2017-09-27 Thread Jan Dakinevich


On 09/27/2017 06:53 PM, Dr. David Alan Gilbert wrote:
> * Jan Dakinevich (jan.dakinev...@virtuozzo.com) wrote:
>> The command is intended for exposing device specific virtio feature bits
>> and their negotiation status. It is convenient and useful for debug
>> purpose.
>>
>> Names of features are taken from a devices via get_feature_name() within
>> VirtioDeviceClass. If certain device doesn't implement it, the command
>> will print only hexadecimal value of feature mask.
>>
>> Cc: Denis V. Lunev 
>> Signed-off-by: Jan Dakinevich 
>> ---
>> v2:
>> * Moved hmp_info_virtio and stuff to hmp.c to avoid build error
>> * Added printing of device status
>> * Listed common virtio features
>>
>> v1: http://lists.nongnu.org/archive/html/qemu-devel/2017-09/msg07247.html
>> ---
>>  hmp-commands-info.hx|  14 +
>>  hmp.c   | 148 
>> 
>>  hmp.h   |   1 +
>>  hw/block/virtio-blk.c   |  20 ++
>>  hw/char/virtio-serial-bus.c |  15 +
>>  hw/display/virtio-gpu.c |  12 
>>  hw/net/virtio-net.c |  34 ++
>>  hw/scsi/virtio-scsi.c   |  15 +
>>  hw/virtio/virtio-balloon.c  |  15 +
>>  hw/virtio/virtio-bus.c  |   1 +
>>  include/hw/virtio/virtio.h  |   2 +
>>  monitor.c   |   1 +
>>  12 files changed, 278 insertions(+)
>>
>> diff --git a/hmp-commands-info.hx b/hmp-commands-info.hx
>> index 4f1ece9..2550027 100644
>> --- a/hmp-commands-info.hx
>> +++ b/hmp-commands-info.hx
>> @@ -868,6 +868,20 @@ ETEXI
>>  },
>>  
>>  STEXI
>> +@item info virtio
>> +@findex virtio
>> +Display guest and host fetures for all virtio devices.
>> +ETEXI
>> +
>> +{
>> +.name   = "virtio",
>> +.args_type  = "",
>> +.params = "",
>> +.help   = "show virtio info",
>> +.cmd= hmp_info_virtio,
>> +},
>> +
>> +STEXI
>>  @end table
>>  ETEXI
>>  
>> diff --git a/hmp.c b/hmp.c
>> index ace729d..009d808 100644
>> --- a/hmp.c
>> +++ b/hmp.c
>> @@ -43,6 +43,7 @@
>>  #include "hw/intc/intc.h"
>>  #include "migration/snapshot.h"
>>  #include "migration/misc.h"
>> +#include "hw/virtio/virtio.h"
>>  
>>  #ifdef CONFIG_SPICE
>>  #include 
>> @@ -2894,3 +2895,150 @@ void hmp_info_memory_size_summary(Monitor *mon, 
>> const QDict *qdict)
>>  }
>>  hmp_handle_error(mon, &err);
>>  }
>> +
>> +#define HMP_INFO_VIRTIO_INDENT 2
>> +#define HMP_INFO_VIRTIO_FIELD 32
>> +
>> +static void hmp_info_virtio_print_status(Monitor *mon, VirtIODevice *vdev)
>> +{
>> +uint8_t status[] = {
>> +VIRTIO_CONFIG_S_ACKNOWLEDGE,
>> +VIRTIO_CONFIG_S_DRIVER,
>> +VIRTIO_CONFIG_S_DRIVER_OK,
>> +VIRTIO_CONFIG_S_FEATURES_OK,
>> +VIRTIO_CONFIG_S_NEEDS_RESET,
>> +VIRTIO_CONFIG_S_FAILED,
>> +};
>> +const char *names[] = {
>> +"acknowledge",
>> +"driver",
>> +"driver_ok",
>> +"features_ok",
>> +"needs_reset"
>> +"failed",
>> +};
>> +const char *comma = "";
>> +int idx;
>> +
>> +monitor_printf(mon, "%*sstatus:   0x%02"PRIx8" ",
>> +   HMP_INFO_VIRTIO_INDENT, "", vdev->status);
>> +
>> +for (idx = 0; idx < ARRAY_SIZE(status); idx++) {
>> +if (!(vdev->status & status[idx])) {
>> +continue;
>> +}
>> +monitor_printf(mon, "%s%s", comma, names[idx]);
>> +if (names[idx]) {
>> +comma = ",";
>> +}
>> +}
>> +monitor_printf(mon, "\n");
>> +}
>> +
>> +static void hmp_info_virtio_print_common_features(Monitor *mon,
>> +  VirtIODevice *vdev)
>> +{
>> +uint64_t features[] = {
>> +VIRTIO_RING_F_INDIRECT_DESC,
>> +VIRTIO_RING_F_EVENT_IDX,
>> +VIRTIO_F_NOTIFY_ON_EMPTY,
>> +VIRTIO_F_ANY_LAYOUT,
>> +VIRTIO_F_IOMMU_PLATFORM,
>> +VIRTIO_F_VERSION_1,
>> +};
>> +const char *names[] = {
>> +"indirect_desc",
>> +"event_idx",
>> +"notify_on_empty",
>> +"any_layout",
>> +"iommu_platform",
>> +"version_1",
>> +};
>> +int idx;
>> +
>> +monitor_printf(mon, "%*scommon features:\n", HMP_INFO_VIRTIO_INDENT, 
>> "");
>> +
>> +for (idx = 0; idx < ARRAY_SIZE(features); idx++) {
>> +const char *ack = vdev->guest_features & features[idx] ? "acked" : 
>> "";
>> +
>> +if (!(vdev->host_features & features[idx])) {
>> +continue;
>> +}
>> +monitor_printf(mon, "%*s%*s%*s\n", HMP_INFO_VIRTIO_INDENT, "",
>> +   HMP_INFO_VIRTIO_FIELD, names[idx],
>> +   HMP_INFO_VIRTIO_FIELD, ack);
>> +}
>> +}
>> +
>> +static void hmp_info_virtio_print_specific_features(Monitor *mon,
>> +VirtIODevice *vdev)
>> +{
>> +VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev);
>> +int idx;
>> +
>> +

Re: [Qemu-block] [PULL 00/24] Block layer patches

2017-09-27 Thread Peter Maydell
On 26 September 2017 at 07:21, Kevin Wolf  wrote:
> The following changes since commit 1e3ee834083227f552179f6e43902cba5a866e6b:
>
>   Merge remote-tracking branch 'remotes/thibault/tags/samuel-thibault' into 
> staging (2017-09-25 20:31:24 +0100)
>
> are available in the git repository at:
>
>   git://repo.or.cz/qemu/kevin.git tags/for-upstream
>
> for you to fetch changes up to b156d51b62e6970753e1f9f36f7c4d5fdbf4c619:
>
>   Merge remote-tracking branch 'mreitz/tags/pull-block-2017-09-26' into 
> queue-block (2017-09-26 15:03:02 +0200)
>
> 
> Block layer patches
>
> 

Applied, thanks.

-- PMM



Re: [Qemu-block] [PATCH v2 2/2] qcow2: truncate the tail of the image file after shrinking the image

2017-09-27 Thread Max Reitz
On 2017-09-27 18:27, Pavel Butsykin wrote:
> On 27.09.2017 19:00, Max Reitz wrote:
>> On 2017-09-22 11:39, Pavel Butsykin wrote:
>>> Now after shrinking the image, at the end of the image file, there
>>> might be a
>>> tail that probably will never be used. So we can find the last used
>>> cluster and
>>> cut the tail.
>>>
>>> Signed-off-by: Pavel Butsykin 
>>> ---
>>>   block/qcow2-refcount.c | 22 ++
>>>   block/qcow2.c  | 23 +++
>>>   block/qcow2.h  |  1 +
>>>   3 files changed, 46 insertions(+)
>>>
>>> diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
>>> index 88d5a3f1ad..aa3fd6cf17 100644
>>> --- a/block/qcow2-refcount.c
>>> +++ b/block/qcow2-refcount.c
>>> @@ -3181,3 +3181,25 @@ out:
>>>   g_free(reftable_tmp);
>>>   return ret;
>>>   }
>>> +
>>> +int64_t qcow2_get_last_cluster(BlockDriverState *bs, int64_t size)
>>> +{
>>> +    BDRVQcow2State *s = bs->opaque;
>>> +    int64_t i;
>>> +
>>> +    for (i = size_to_clusters(s, size) - 1; i >= 0; i--) {
>>> +    uint64_t refcount;
>>> +    int ret = qcow2_get_refcount(bs, i, &refcount);
>>> +    if (ret < 0) {
>>> +    fprintf(stderr, "Can't get refcount for cluster %"
>>> PRId64 ": %s\n",
>>> +    i, strerror(-ret));
>>> +    return ret;
>>> +    }
>>> +    if (refcount > 0) {
>>> +    return i;
>>> +    }
>>> +    }
>>> +    qcow2_signal_corruption(bs, true, -1, -1,
>>> +    "There are no references in the refcount
>>> table.");
>>> +    return -EIO;
>>> +}
>>> diff --git a/block/qcow2.c b/block/qcow2.c
>>> index 8a4311d338..8dfb5393a7 100644
>>> --- a/block/qcow2.c
>>> +++ b/block/qcow2.c
>>> @@ -3106,6 +3106,7 @@ static int qcow2_truncate(BlockDriverState *bs,
>>> int64_t offset,
>>>   new_l1_size = size_to_l1(s, offset);
>>>     if (offset < old_length) {
>>> +    int64_t last_cluster, old_file_size;
>>>   if (prealloc != PREALLOC_MODE_OFF) {
>>>   error_setg(errp,
>>>  "Preallocation can't be used for shrinking
>>> an image");
>>> @@ -3134,6 +3135,28 @@ static int qcow2_truncate(BlockDriverState
>>> *bs, int64_t offset,
>>>    "Failed to discard unused refblocks");
>>>   return ret;
>>>   }
>>> +
>>> +    old_file_size = bdrv_getlength(bs->file->bs);
>>> +    if (old_file_size < 0) {
>>> +    error_setg_errno(errp, -old_file_size,
>>> + "Failed to inquire current file length");
>>> +    return old_file_size;
>>> +    }
>>> +    last_cluster = qcow2_get_last_cluster(bs, old_file_size);
>>> +    if (last_cluster < 0) {
>>> +    error_setg_errno(errp, -last_cluster,
>>> + "Failed to find the last cluster");
>>> +    return last_cluster;
>>> +    }
>>
>> My idea was rather that you just wouldn't truncate the image file if
>> something fails here.  So in any of these new cases where you currently
>> just report the whole truncate operation as having failed, you could
>> just emit a warning and not do the truncation of bs->file.
> 
> I'm not sure that's right. If the qcow2_get_last_cluster() returned an
> error, probably with the image was a problem.. can we continue to work
> with the image without risking to damage it even more? if something bad
> happened with the reftable we usually mark the image as corrupted, it's
> the same thing.

Well, the only thing that's left to do is to write the new size into the
image header, so I think that should work just fine...

I won't disagree that bdrv_getlength() or qcow2_get_last_cluster()
failing may be reasons to stop truncation (although I don't think they
necessarily are at this point).

But I could well imagine that the below bdrv_truncate() of bs->file
fails for benign reasons, e.g. because the underlying protocol does not
support shrinking of images or something.  Then we probably should carry on.

Max

> Although if the shrink is almost complete, the truncation of bs->file
> isn't so important thing and we could update qcow2 header.
> 
>> I can live with the current version, though, so:
>>
>> Reviewed-by: Max Reitz 
>>
>> But I'll wait for a response from you before merging this series.
>>
>> Max
>>
>>> +    if ((last_cluster + 1) * s->cluster_size < old_file_size) {
>>> +    ret = bdrv_truncate(bs->file, (last_cluster + 1) *
>>> s->cluster_size,
>>> +    PREALLOC_MODE_OFF, NULL);
>>> +    if (ret < 0) {
>>> +    error_setg_errno(errp, -ret,
>>> + "Failed to truncate the tail of the
>>> image");
>>> +    return ret;
>>> +    }
>>> +    }
>>>   } else {
>>>   ret = qcow2_grow_l1_table(bs, new_l1_size, true);
>>>   if (ret < 0) {
>>> diff --git a/block/qcow2.h b/block/qcow2.h
>>> index 5a289a81e2..782a2

Re: [Qemu-block] blockdev-commit design

2017-09-27 Thread Max Reitz
On 2017-09-26 19:59, Kevin Wolf wrote:

[...]

>   * The old block-commit command decides between an "actual" commit job
> and the mirror-based active commit based on whether top is the
> active layer.
> 
> We don't get an option for the active layer any more now, so this
> isn't how things can work for blockdev-commit. We could probably
> check whether top has a BlockBackend parent, but that's not really
> what we're interested in anyway. Maybe the best we could do to
> decide this automatically is to check whether there is any parent of
> top that requires write permissions. If there is, we need active
> commit, otherwise the "normal" one is good enough.
> 
> However, who says that the intentions of the user stay as we deduce
> them at the start of the block job? Who says that the user doesn't
> want to add a writable disk as a user of the node while the block
> job is running?
> 
> The optimal solution to this would be that the commit filter node
> responds to permission requests and switches between active and
> "normal" commit mode. I'm not sure how hard this would be to
> implement.
> 
> As long as we don't have the automatic switch, do we have to allow
> the user to specify explicitly which mode they want instead of
> automatically choosing one?

Probably a stupid question: What advantage does the commit job have over
the mirror job? I.e. would it be possible to just always do a mirror?

Max



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH v2 2/2] qcow2: truncate the tail of the image file after shrinking the image

2017-09-27 Thread Pavel Butsykin

On 27.09.2017 19:00, Max Reitz wrote:

On 2017-09-22 11:39, Pavel Butsykin wrote:

Now after shrinking the image, at the end of the image file, there might be a
tail that probably will never be used. So we can find the last used cluster and
cut the tail.

Signed-off-by: Pavel Butsykin 
---
  block/qcow2-refcount.c | 22 ++
  block/qcow2.c  | 23 +++
  block/qcow2.h  |  1 +
  3 files changed, 46 insertions(+)

diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 88d5a3f1ad..aa3fd6cf17 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -3181,3 +3181,25 @@ out:
  g_free(reftable_tmp);
  return ret;
  }
+
+int64_t qcow2_get_last_cluster(BlockDriverState *bs, int64_t size)
+{
+BDRVQcow2State *s = bs->opaque;
+int64_t i;
+
+for (i = size_to_clusters(s, size) - 1; i >= 0; i--) {
+uint64_t refcount;
+int ret = qcow2_get_refcount(bs, i, &refcount);
+if (ret < 0) {
+fprintf(stderr, "Can't get refcount for cluster %" PRId64 ": %s\n",
+i, strerror(-ret));
+return ret;
+}
+if (refcount > 0) {
+return i;
+}
+}
+qcow2_signal_corruption(bs, true, -1, -1,
+"There are no references in the refcount table.");
+return -EIO;
+}
diff --git a/block/qcow2.c b/block/qcow2.c
index 8a4311d338..8dfb5393a7 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -3106,6 +3106,7 @@ static int qcow2_truncate(BlockDriverState *bs, int64_t 
offset,
  new_l1_size = size_to_l1(s, offset);
  
  if (offset < old_length) {

+int64_t last_cluster, old_file_size;
  if (prealloc != PREALLOC_MODE_OFF) {
  error_setg(errp,
 "Preallocation can't be used for shrinking an image");
@@ -3134,6 +3135,28 @@ static int qcow2_truncate(BlockDriverState *bs, int64_t 
offset,
   "Failed to discard unused refblocks");
  return ret;
  }
+
+old_file_size = bdrv_getlength(bs->file->bs);
+if (old_file_size < 0) {
+error_setg_errno(errp, -old_file_size,
+ "Failed to inquire current file length");
+return old_file_size;
+}
+last_cluster = qcow2_get_last_cluster(bs, old_file_size);
+if (last_cluster < 0) {
+error_setg_errno(errp, -last_cluster,
+ "Failed to find the last cluster");
+return last_cluster;
+}


My idea was rather that you just wouldn't truncate the image file if
something fails here.  So in any of these new cases where you currently
just report the whole truncate operation as having failed, you could
just emit a warning and not do the truncation of bs->file.


I'm not sure that's right. If the qcow2_get_last_cluster() returned an
error, probably with the image was a problem.. can we continue to work
with the image without risking to damage it even more? if something bad
happened with the reftable we usually mark the image as corrupted, it's
the same thing.

Although if the shrink is almost complete, the truncation of bs->file
isn't so important thing and we could update qcow2 header.


I can live with the current version, though, so:

Reviewed-by: Max Reitz 

But I'll wait for a response from you before merging this series.

Max


+if ((last_cluster + 1) * s->cluster_size < old_file_size) {
+ret = bdrv_truncate(bs->file, (last_cluster + 1) * s->cluster_size,
+PREALLOC_MODE_OFF, NULL);
+if (ret < 0) {
+error_setg_errno(errp, -ret,
+ "Failed to truncate the tail of the image");
+return ret;
+}
+}
  } else {
  ret = qcow2_grow_l1_table(bs, new_l1_size, true);
  if (ret < 0) {
diff --git a/block/qcow2.h b/block/qcow2.h
index 5a289a81e2..782a206ecb 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -597,6 +597,7 @@ int qcow2_change_refcount_order(BlockDriverState *bs, int 
refcount_order,
  BlockDriverAmendStatusCB *status_cb,
  void *cb_opaque, Error **errp);
  int qcow2_shrink_reftable(BlockDriverState *bs);
+int64_t qcow2_get_last_cluster(BlockDriverState *bs, int64_t size);
  
  /* qcow2-cluster.c functions */

  int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size,








Re: [Qemu-block] [PATCH v2 2/2] qcow2: truncate the tail of the image file after shrinking the image

2017-09-27 Thread Max Reitz
On 2017-09-22 11:39, Pavel Butsykin wrote:
> Now after shrinking the image, at the end of the image file, there might be a
> tail that probably will never be used. So we can find the last used cluster 
> and
> cut the tail.
> 
> Signed-off-by: Pavel Butsykin 
> ---
>  block/qcow2-refcount.c | 22 ++
>  block/qcow2.c  | 23 +++
>  block/qcow2.h  |  1 +
>  3 files changed, 46 insertions(+)
> 
> diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
> index 88d5a3f1ad..aa3fd6cf17 100644
> --- a/block/qcow2-refcount.c
> +++ b/block/qcow2-refcount.c
> @@ -3181,3 +3181,25 @@ out:
>  g_free(reftable_tmp);
>  return ret;
>  }
> +
> +int64_t qcow2_get_last_cluster(BlockDriverState *bs, int64_t size)
> +{
> +BDRVQcow2State *s = bs->opaque;
> +int64_t i;
> +
> +for (i = size_to_clusters(s, size) - 1; i >= 0; i--) {
> +uint64_t refcount;
> +int ret = qcow2_get_refcount(bs, i, &refcount);
> +if (ret < 0) {
> +fprintf(stderr, "Can't get refcount for cluster %" PRId64 ": 
> %s\n",
> +i, strerror(-ret));
> +return ret;
> +}
> +if (refcount > 0) {
> +return i;
> +}
> +}
> +qcow2_signal_corruption(bs, true, -1, -1,
> +"There are no references in the refcount 
> table.");
> +return -EIO;
> +}
> diff --git a/block/qcow2.c b/block/qcow2.c
> index 8a4311d338..8dfb5393a7 100644
> --- a/block/qcow2.c
> +++ b/block/qcow2.c
> @@ -3106,6 +3106,7 @@ static int qcow2_truncate(BlockDriverState *bs, int64_t 
> offset,
>  new_l1_size = size_to_l1(s, offset);
>  
>  if (offset < old_length) {
> +int64_t last_cluster, old_file_size;
>  if (prealloc != PREALLOC_MODE_OFF) {
>  error_setg(errp,
> "Preallocation can't be used for shrinking an image");
> @@ -3134,6 +3135,28 @@ static int qcow2_truncate(BlockDriverState *bs, 
> int64_t offset,
>   "Failed to discard unused refblocks");
>  return ret;
>  }
> +
> +old_file_size = bdrv_getlength(bs->file->bs);
> +if (old_file_size < 0) {
> +error_setg_errno(errp, -old_file_size,
> + "Failed to inquire current file length");
> +return old_file_size;
> +}
> +last_cluster = qcow2_get_last_cluster(bs, old_file_size);
> +if (last_cluster < 0) {
> +error_setg_errno(errp, -last_cluster,
> + "Failed to find the last cluster");
> +return last_cluster;
> +}

My idea was rather that you just wouldn't truncate the image file if
something fails here.  So in any of these new cases where you currently
just report the whole truncate operation as having failed, you could
just emit a warning and not do the truncation of bs->file.

I can live with the current version, though, so:

Reviewed-by: Max Reitz 

But I'll wait for a response from you before merging this series.

Max

> +if ((last_cluster + 1) * s->cluster_size < old_file_size) {
> +ret = bdrv_truncate(bs->file, (last_cluster + 1) * 
> s->cluster_size,
> +PREALLOC_MODE_OFF, NULL);
> +if (ret < 0) {
> +error_setg_errno(errp, -ret,
> + "Failed to truncate the tail of the image");
> +return ret;
> +}
> +}
>  } else {
>  ret = qcow2_grow_l1_table(bs, new_l1_size, true);
>  if (ret < 0) {
> diff --git a/block/qcow2.h b/block/qcow2.h
> index 5a289a81e2..782a206ecb 100644
> --- a/block/qcow2.h
> +++ b/block/qcow2.h
> @@ -597,6 +597,7 @@ int qcow2_change_refcount_order(BlockDriverState *bs, int 
> refcount_order,
>  BlockDriverAmendStatusCB *status_cb,
>  void *cb_opaque, Error **errp);
>  int qcow2_shrink_reftable(BlockDriverState *bs);
> +int64_t qcow2_get_last_cluster(BlockDriverState *bs, int64_t size);
>  
>  /* qcow2-cluster.c functions */
>  int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size,
> 




signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [Qemu-devel] [PATCH v2] virtio: introduce `info virtio' hmp command

2017-09-27 Thread Dr. David Alan Gilbert
* Jan Dakinevich (jan.dakinev...@virtuozzo.com) wrote:
> The command is intended for exposing device specific virtio feature bits
> and their negotiation status. It is convenient and useful for debug
> purpose.
> 
> Names of features are taken from a devices via get_feature_name() within
> VirtioDeviceClass. If certain device doesn't implement it, the command
> will print only hexadecimal value of feature mask.
> 
> Cc: Denis V. Lunev 
> Signed-off-by: Jan Dakinevich 
> ---
> v2:
> * Moved hmp_info_virtio and stuff to hmp.c to avoid build error
> * Added printing of device status
> * Listed common virtio features
> 
> v1: http://lists.nongnu.org/archive/html/qemu-devel/2017-09/msg07247.html
> ---
>  hmp-commands-info.hx|  14 +
>  hmp.c   | 148 
> 
>  hmp.h   |   1 +
>  hw/block/virtio-blk.c   |  20 ++
>  hw/char/virtio-serial-bus.c |  15 +
>  hw/display/virtio-gpu.c |  12 
>  hw/net/virtio-net.c |  34 ++
>  hw/scsi/virtio-scsi.c   |  15 +
>  hw/virtio/virtio-balloon.c  |  15 +
>  hw/virtio/virtio-bus.c  |   1 +
>  include/hw/virtio/virtio.h  |   2 +
>  monitor.c   |   1 +
>  12 files changed, 278 insertions(+)
> 
> diff --git a/hmp-commands-info.hx b/hmp-commands-info.hx
> index 4f1ece9..2550027 100644
> --- a/hmp-commands-info.hx
> +++ b/hmp-commands-info.hx
> @@ -868,6 +868,20 @@ ETEXI
>  },
>  
>  STEXI
> +@item info virtio
> +@findex virtio
> +Display guest and host fetures for all virtio devices.
> +ETEXI
> +
> +{
> +.name   = "virtio",
> +.args_type  = "",
> +.params = "",
> +.help   = "show virtio info",
> +.cmd= hmp_info_virtio,
> +},
> +
> +STEXI
>  @end table
>  ETEXI
>  
> diff --git a/hmp.c b/hmp.c
> index ace729d..009d808 100644
> --- a/hmp.c
> +++ b/hmp.c
> @@ -43,6 +43,7 @@
>  #include "hw/intc/intc.h"
>  #include "migration/snapshot.h"
>  #include "migration/misc.h"
> +#include "hw/virtio/virtio.h"
>  
>  #ifdef CONFIG_SPICE
>  #include 
> @@ -2894,3 +2895,150 @@ void hmp_info_memory_size_summary(Monitor *mon, const 
> QDict *qdict)
>  }
>  hmp_handle_error(mon, &err);
>  }
> +
> +#define HMP_INFO_VIRTIO_INDENT 2
> +#define HMP_INFO_VIRTIO_FIELD 32
> +
> +static void hmp_info_virtio_print_status(Monitor *mon, VirtIODevice *vdev)
> +{
> +uint8_t status[] = {
> +VIRTIO_CONFIG_S_ACKNOWLEDGE,
> +VIRTIO_CONFIG_S_DRIVER,
> +VIRTIO_CONFIG_S_DRIVER_OK,
> +VIRTIO_CONFIG_S_FEATURES_OK,
> +VIRTIO_CONFIG_S_NEEDS_RESET,
> +VIRTIO_CONFIG_S_FAILED,
> +};
> +const char *names[] = {
> +"acknowledge",
> +"driver",
> +"driver_ok",
> +"features_ok",
> +"needs_reset"
> +"failed",
> +};
> +const char *comma = "";
> +int idx;
> +
> +monitor_printf(mon, "%*sstatus:   0x%02"PRIx8" ",
> +   HMP_INFO_VIRTIO_INDENT, "", vdev->status);
> +
> +for (idx = 0; idx < ARRAY_SIZE(status); idx++) {
> +if (!(vdev->status & status[idx])) {
> +continue;
> +}
> +monitor_printf(mon, "%s%s", comma, names[idx]);
> +if (names[idx]) {
> +comma = ",";
> +}
> +}
> +monitor_printf(mon, "\n");
> +}
> +
> +static void hmp_info_virtio_print_common_features(Monitor *mon,
> +  VirtIODevice *vdev)
> +{
> +uint64_t features[] = {
> +VIRTIO_RING_F_INDIRECT_DESC,
> +VIRTIO_RING_F_EVENT_IDX,
> +VIRTIO_F_NOTIFY_ON_EMPTY,
> +VIRTIO_F_ANY_LAYOUT,
> +VIRTIO_F_IOMMU_PLATFORM,
> +VIRTIO_F_VERSION_1,
> +};
> +const char *names[] = {
> +"indirect_desc",
> +"event_idx",
> +"notify_on_empty",
> +"any_layout",
> +"iommu_platform",
> +"version_1",
> +};
> +int idx;
> +
> +monitor_printf(mon, "%*scommon features:\n", HMP_INFO_VIRTIO_INDENT, "");
> +
> +for (idx = 0; idx < ARRAY_SIZE(features); idx++) {
> +const char *ack = vdev->guest_features & features[idx] ? "acked" : 
> "";
> +
> +if (!(vdev->host_features & features[idx])) {
> +continue;
> +}
> +monitor_printf(mon, "%*s%*s%*s\n", HMP_INFO_VIRTIO_INDENT, "",
> +   HMP_INFO_VIRTIO_FIELD, names[idx],
> +   HMP_INFO_VIRTIO_FIELD, ack);
> +}
> +}
> +
> +static void hmp_info_virtio_print_specific_features(Monitor *mon,
> +VirtIODevice *vdev)
> +{
> +VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev);
> +int idx;
> +
> +if (!vdc->get_feature_name) {
> +return;
> +}
> +
> +monitor_printf(mon, "%*sspecific features:\n", HMP_INFO_VIRTIO_INDENT, 
> "");
> +
> +for (idx = 0; idx < 64; idx++) {
> +const char *n

[Qemu-block] [PATCH v1.1 DRAFT] nbd: Minimal structured read for client

2017-09-27 Thread Vladimir Sementsov-Ogievskiy
Minimal implementation: drop most of additional error information.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---

Hi!

Here is a draft of how to refactor reply-payload receiving if you don't
like my previous simple (but not flexible) try. Ofcourse, if we agree on this
approach this patch should be splitted into several ones and many things
(error handling) should be improved.

The idea is:

nbd_read_reply_entry reads only reply header through nbd/client.c code.

Then, the payload is read through block/nbd-client-cmds.c:
simple payload: generic per-command handler, however it should only exist
  for CMD_READ
structured NONE: no payload, handle in nbd_co_receive_one_chunk
structured error: read by nbd_handle_structured_error_payload
  defined in block/nbd-client-cmds.c
structured success: read by per-[command X reply-type] handler
  defined in block/nbd-client-cmds.c

For now nbd-client-cmds.c looks more like nbd-payload.c, but, may be, we
should move command sending special-casing (CMD_WRITE) to it too..

Don't waste time on careful reviewing this patch, let's consider first
the concept of nbd-client-cmds.c,

 block/nbd-client.h  |  10 +++
 include/block/nbd.h |  82 --
 nbd/nbd-internal.h  |  25 --
 block/nbd-client-cmds.c | 220 
 block/nbd-client.c  | 118 --
 nbd/client.c| 128 
 block/Makefile.objs |   2 +-
 7 files changed, 475 insertions(+), 110 deletions(-)
 create mode 100644 block/nbd-client-cmds.c

diff --git a/block/nbd-client.h b/block/nbd-client.h
index b435754b82..abb88e4ea5 100644
--- a/block/nbd-client.h
+++ b/block/nbd-client.h
@@ -35,6 +35,8 @@ typedef struct NBDClientSession {
 NBDClientRequest requests[MAX_NBD_REQUESTS];
 NBDReply reply;
 bool quit;
+
+bool structured_reply;
 } NBDClientSession;
 
 NBDClientSession *nbd_get_client_session(BlockDriverState *bs);
@@ -60,4 +62,12 @@ void nbd_client_detach_aio_context(BlockDriverState *bs);
 void nbd_client_attach_aio_context(BlockDriverState *bs,
AioContext *new_context);
 
+int nbd_handle_structured_payload(QIOChannel *ioc, int cmd,
+  NBDStructuredReplyChunk *reply, void 
*opaque);
+int nbd_handle_simple_payload(QIOChannel *ioc, int cmd, void *opaque);
+
+int nbd_handle_structured_error_payload(QIOChannel *ioc,
+NBDStructuredReplyChunk *reply,
+int *request_ret);
+
 #endif /* NBD_CLIENT_H */
diff --git a/include/block/nbd.h b/include/block/nbd.h
index 314f2f9bbc..b9a4e1dfa9 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -57,12 +57,6 @@ struct NBDRequest {
 };
 typedef struct NBDRequest NBDRequest;
 
-struct NBDReply {
-uint64_t handle;
-uint32_t error;
-};
-typedef struct NBDReply NBDReply;
-
 typedef struct NBDSimpleReply {
 uint32_t magic;  /* NBD_SIMPLE_REPLY_MAGIC */
 uint32_t error;
@@ -77,6 +71,24 @@ typedef struct NBDStructuredReplyChunk {
 uint32_t length; /* length of payload */
 } QEMU_PACKED NBDStructuredReplyChunk;
 
+typedef union NBDReply {
+NBDSimpleReply simple;
+NBDStructuredReplyChunk structured;
+struct {
+uint32_t magic;
+uint32_t _skip;
+uint64_t handle;
+} QEMU_PACKED;
+} NBDReply;
+
+#define NBD_SIMPLE_REPLY_MAGIC  0x67446698
+#define NBD_STRUCTURED_REPLY_MAGIC  0x668e33ef
+
+static inline bool nbd_reply_is_simple(NBDReply *reply)
+{
+return reply->magic == NBD_SIMPLE_REPLY_MAGIC;
+}
+
 typedef struct NBDStructuredRead {
 NBDStructuredReplyChunk h;
 uint64_t offset;
@@ -88,6 +100,11 @@ typedef struct NBDStructuredError {
 uint16_t message_length;
 } QEMU_PACKED NBDStructuredError;
 
+typedef struct NBDPayloadOffsetHole {
+uint64_t offset;
+uint32_t hole_size;
+} QEMU_PACKED NBDPayloadOffsetHole;
+
 /* Transmission (export) flags: sent from server to client during handshake,
but describe what will happen during transmission */
 #define NBD_FLAG_HAS_FLAGS (1 << 0) /* Flags are there */
@@ -178,12 +195,54 @@ enum {
 
 #define NBD_SREP_TYPE_NONE  0
 #define NBD_SREP_TYPE_OFFSET_DATA   1
+#define NBD_SREP_TYPE_OFFSET_HOLE   2
 #define NBD_SREP_TYPE_ERROR NBD_SREP_ERR(1)
+#define NBD_SREP_TYPE_ERROR_OFFSET  NBD_SREP_ERR(2)
+
+static inline bool nbd_srep_type_is_error(int type)
+{
+return type & (1 << 15);
+}
+
+/* NBD errors are based on errno numbers, so there is a 1:1 mapping,
+ * but only a limited set of errno values is specified in the protocol.
+ * Everything else is squashed to EINVAL.
+ */
+#define NBD_SUCCESS0
+#define NBD_EPERM  1
+#define NBD_EIO5
+#define NBD_ENOMEM 12
+#define NBD_EINVAL 22
+#define NBD_ENOSPC 28
+#define NBD_ESHUTDOWN  108
+
+static inline int nbd_errno_to_system_errno(int err)
+{
+switch (err) {
+case NBD_SUCCES

[Qemu-block] [Qemu-devel] [PATCH v2] virtio: introduce `info virtio' hmp command

2017-09-27 Thread Jan Dakinevich
The command is intended for exposing device specific virtio feature bits
and their negotiation status. It is convenient and useful for debug
purpose.

Names of features are taken from a devices via get_feature_name() within
VirtioDeviceClass. If certain device doesn't implement it, the command
will print only hexadecimal value of feature mask.

Cc: Denis V. Lunev 
Signed-off-by: Jan Dakinevich 
---
v2:
* Moved hmp_info_virtio and stuff to hmp.c to avoid build error
* Added printing of device status
* Listed common virtio features

v1: http://lists.nongnu.org/archive/html/qemu-devel/2017-09/msg07247.html
---
 hmp-commands-info.hx|  14 +
 hmp.c   | 148 
 hmp.h   |   1 +
 hw/block/virtio-blk.c   |  20 ++
 hw/char/virtio-serial-bus.c |  15 +
 hw/display/virtio-gpu.c |  12 
 hw/net/virtio-net.c |  34 ++
 hw/scsi/virtio-scsi.c   |  15 +
 hw/virtio/virtio-balloon.c  |  15 +
 hw/virtio/virtio-bus.c  |   1 +
 include/hw/virtio/virtio.h  |   2 +
 monitor.c   |   1 +
 12 files changed, 278 insertions(+)

diff --git a/hmp-commands-info.hx b/hmp-commands-info.hx
index 4f1ece9..2550027 100644
--- a/hmp-commands-info.hx
+++ b/hmp-commands-info.hx
@@ -868,6 +868,20 @@ ETEXI
 },
 
 STEXI
+@item info virtio
+@findex virtio
+Display guest and host fetures for all virtio devices.
+ETEXI
+
+{
+.name   = "virtio",
+.args_type  = "",
+.params = "",
+.help   = "show virtio info",
+.cmd= hmp_info_virtio,
+},
+
+STEXI
 @end table
 ETEXI
 
diff --git a/hmp.c b/hmp.c
index ace729d..009d808 100644
--- a/hmp.c
+++ b/hmp.c
@@ -43,6 +43,7 @@
 #include "hw/intc/intc.h"
 #include "migration/snapshot.h"
 #include "migration/misc.h"
+#include "hw/virtio/virtio.h"
 
 #ifdef CONFIG_SPICE
 #include 
@@ -2894,3 +2895,150 @@ void hmp_info_memory_size_summary(Monitor *mon, const 
QDict *qdict)
 }
 hmp_handle_error(mon, &err);
 }
+
+#define HMP_INFO_VIRTIO_INDENT 2
+#define HMP_INFO_VIRTIO_FIELD 32
+
+static void hmp_info_virtio_print_status(Monitor *mon, VirtIODevice *vdev)
+{
+uint8_t status[] = {
+VIRTIO_CONFIG_S_ACKNOWLEDGE,
+VIRTIO_CONFIG_S_DRIVER,
+VIRTIO_CONFIG_S_DRIVER_OK,
+VIRTIO_CONFIG_S_FEATURES_OK,
+VIRTIO_CONFIG_S_NEEDS_RESET,
+VIRTIO_CONFIG_S_FAILED,
+};
+const char *names[] = {
+"acknowledge",
+"driver",
+"driver_ok",
+"features_ok",
+"needs_reset"
+"failed",
+};
+const char *comma = "";
+int idx;
+
+monitor_printf(mon, "%*sstatus:   0x%02"PRIx8" ",
+   HMP_INFO_VIRTIO_INDENT, "", vdev->status);
+
+for (idx = 0; idx < ARRAY_SIZE(status); idx++) {
+if (!(vdev->status & status[idx])) {
+continue;
+}
+monitor_printf(mon, "%s%s", comma, names[idx]);
+if (names[idx]) {
+comma = ",";
+}
+}
+monitor_printf(mon, "\n");
+}
+
+static void hmp_info_virtio_print_common_features(Monitor *mon,
+  VirtIODevice *vdev)
+{
+uint64_t features[] = {
+VIRTIO_RING_F_INDIRECT_DESC,
+VIRTIO_RING_F_EVENT_IDX,
+VIRTIO_F_NOTIFY_ON_EMPTY,
+VIRTIO_F_ANY_LAYOUT,
+VIRTIO_F_IOMMU_PLATFORM,
+VIRTIO_F_VERSION_1,
+};
+const char *names[] = {
+"indirect_desc",
+"event_idx",
+"notify_on_empty",
+"any_layout",
+"iommu_platform",
+"version_1",
+};
+int idx;
+
+monitor_printf(mon, "%*scommon features:\n", HMP_INFO_VIRTIO_INDENT, "");
+
+for (idx = 0; idx < ARRAY_SIZE(features); idx++) {
+const char *ack = vdev->guest_features & features[idx] ? "acked" : "";
+
+if (!(vdev->host_features & features[idx])) {
+continue;
+}
+monitor_printf(mon, "%*s%*s%*s\n", HMP_INFO_VIRTIO_INDENT, "",
+   HMP_INFO_VIRTIO_FIELD, names[idx],
+   HMP_INFO_VIRTIO_FIELD, ack);
+}
+}
+
+static void hmp_info_virtio_print_specific_features(Monitor *mon,
+VirtIODevice *vdev)
+{
+VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev);
+int idx;
+
+if (!vdc->get_feature_name) {
+return;
+}
+
+monitor_printf(mon, "%*sspecific features:\n", HMP_INFO_VIRTIO_INDENT, "");
+
+for (idx = 0; idx < 64; idx++) {
+const char *name = vdc->get_feature_name(vdev, idx);
+const char *ack = vdev->guest_features & (1 << idx) ? "acked" : "";
+
+if (!name) {
+continue;
+}
+if (!(vdev->host_features & (1 << idx))) {
+continue;
+}
+monitor_printf(mon, "%*s%*s%*s\n", HMP_INFO_VIRTIO_INDENT, "",
+   HMP_INFO_VIRTIO_FIELD, name, HMP_

Re: [Qemu-block] [PATCH v4 5/6] block: convert qcrypto_block_encrypt|decrypt to take bytes offset

2017-09-27 Thread Eric Blake
On 09/27/2017 07:53 AM, Daniel P. Berrange wrote:
> Instead of sector offset, take the bytes offset when encrypting
> or decrypting data.
> 
> Signed-off-by: Daniel P. Berrange 
> ---
Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH v4 4/6] block: convert crypto driver to bdrv_co_preadv|pwritev

2017-09-27 Thread Eric Blake
On 09/27/2017 07:53 AM, Daniel P. Berrange wrote:
> Make the crypto driver implement the bdrv_co_preadv|pwritev
> callbacks, and also use bdrv_co_preadv|pwritev for I/O
> with the protocol driver beneath. This replaces sector based
> I/O with byte based I/O, and allows us to stop assuming the
> physical sector size matches the encryption sector size.
> 
> Signed-off-by: Daniel P. Berrange 
> ---
>  block/crypto.c | 106 
> +
>  1 file changed, 54 insertions(+), 52 deletions(-)
> 

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH v4 1/6] block: use 1 MB bounce buffers for crypto instead of 16KB

2017-09-27 Thread Eric Blake
On 09/27/2017 07:53 AM, Daniel P. Berrange wrote:
> Using 16KB bounce buffers creates a significant performance
> penalty for I/O to encrypted volumes on storage which high
> I/O latency (rotating rust & network drives), because it
> triggers lots of fairly small I/O operations.
> 
> On tests with rotating rust, and cache=none|directsync,
> write speed increased from 2MiB/s to 32MiB/s, on a par
> with that achieved by the in-kernel luks driver. With
> other cache modes the in-kernel driver is still notably
> faster because it is able to report completion of the
> I/O request before any encryption is done, while the
> in-QEMU driver must encrypt the data before completion.
> 
> Signed-off-by: Daniel P. Berrange 
> ---
>  block/crypto.c | 28 +++-
>  1 file changed, 15 insertions(+), 13 deletions(-)

Reviewed-by: Eric Blake 


> @@ -464,12 +467,11 @@ block_crypto_co_writev(BlockDriverState *bs, int64_t 
> sector_num,
>  
>  qemu_iovec_init(&hd_qiov, qiov->niov);
>  
> -/* Bounce buffer so we have a linear mem region for
> - * entire sector. XXX optimize so we avoid bounce
> - * buffer in case that qiov->niov == 1
> +/* Bounce buffer because we're not permitted to touch
> + * contents of qiov - it points to guest memory.
>   */
>  cipher_data =
> -qemu_try_blockalign(bs->file->bs, MIN(BLOCK_CRYPTO_MAX_SECTORS * 512,
> +qemu_try_blockalign(bs->file->bs, MIN(BLOCK_CRYPTO_MAX_IO_SIZE,
>qiov->size));

We allocate and free a bounce buffer for every write - is there anything
we can do to reduce malloc() calls by reusing a bounce buffer associated
with a coroutine (we have to be careful that parallel coroutines don't
share bounce buffers, of course).  But that's independent of this patch.

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [Qemu-devel] [PATCH 2/5] iotests: Set up Python logging

2017-09-27 Thread Daniel P. Berrange
On Wed, Sep 27, 2017 at 10:03:36AM -0300, Eduardo Habkost wrote:
> Set up Python logging module instead of relying on
> QEMUMachine._debug to enable debugging messages.
> 
> Cc: Kevin Wolf 
> Cc: Max Reitz 
> Cc: qemu-block@nongnu.org
> Signed-off-by: Eduardo Habkost 
> ---
>  tests/qemu-iotests/iotests.py | 3 +++
>  1 file changed, 3 insertions(+)

Reviewed-by: Daniel P. Berrange 


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|



[Qemu-block] [PATCH 2/5] iotests: Set up Python logging

2017-09-27 Thread Eduardo Habkost
Set up Python logging module instead of relying on
QEMUMachine._debug to enable debugging messages.

Cc: Kevin Wolf 
Cc: Max Reitz 
Cc: qemu-block@nongnu.org
Signed-off-by: Eduardo Habkost 
---
 tests/qemu-iotests/iotests.py | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 1af117e37d..36a7757aaf 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -28,6 +28,7 @@ import qtest
 import struct
 import json
 import signal
+import logging
 
 
 # This will not work if arguments contain spaces but is necessary if we
@@ -467,6 +468,8 @@ def main(supported_fmts=[], supported_oses=['linux']):
 else:
 output = StringIO.StringIO()
 
+logging.basicConfig(level=(logging.DEBUG if debug else logging.WARN))
+
 class MyTestRunner(unittest.TextTestRunner):
 def __init__(self, stream=output, descriptions=True, 
verbosity=verbosity):
 unittest.TextTestRunner.__init__(self, stream, descriptions, 
verbosity)
-- 
2.13.5




[Qemu-block] [PATCH v4 5/6] block: convert qcrypto_block_encrypt|decrypt to take bytes offset

2017-09-27 Thread Daniel P. Berrange
Instead of sector offset, take the bytes offset when encrypting
or decrypting data.

Signed-off-by: Daniel P. Berrange 
---
 block/crypto.c | 12 
 block/qcow.c   | 11 +++
 block/qcow2-cluster.c  |  8 +++-
 block/qcow2.c  |  4 ++--
 crypto/block-luks.c| 12 
 crypto/block-qcow.c| 12 
 crypto/block.c | 20 ++--
 crypto/blockpriv.h |  4 ++--
 include/crypto/block.h | 14 --
 9 files changed, 56 insertions(+), 41 deletions(-)

diff --git a/block/crypto.c b/block/crypto.c
index 965c173b01..edf53d49d1 100644
--- a/block/crypto.c
+++ b/block/crypto.c
@@ -398,7 +398,6 @@ block_crypto_co_preadv(BlockDriverState *bs, uint64_t 
offset, uint64_t bytes,
 int ret = 0;
 uint64_t sector_size = qcrypto_block_get_sector_size(crypto->block);
 uint64_t payload_offset = qcrypto_block_get_payload_offset(crypto->block);
-uint64_t sector_num = offset / sector_size;
 
 assert(!flags);
 assert(payload_offset < INT64_MAX);
@@ -430,15 +429,14 @@ block_crypto_co_preadv(BlockDriverState *bs, uint64_t 
offset, uint64_t bytes,
 goto cleanup;
 }
 
-if (qcrypto_block_decrypt(crypto->block, sector_num, cipher_data,
-  cur_bytes, NULL) < 0) {
+if (qcrypto_block_decrypt(crypto->block, offset + bytes_done,
+  cipher_data, cur_bytes, NULL) < 0) {
 ret = -EIO;
 goto cleanup;
 }
 
 qemu_iovec_from_buf(qiov, bytes_done, cipher_data, cur_bytes);
 
-sector_num += cur_bytes / sector_size;
 bytes -= cur_bytes;
 bytes_done += cur_bytes;
 }
@@ -463,7 +461,6 @@ block_crypto_co_pwritev(BlockDriverState *bs, uint64_t 
offset, uint64_t bytes,
 int ret = 0;
 uint64_t sector_size = qcrypto_block_get_sector_size(crypto->block);
 uint64_t payload_offset = qcrypto_block_get_payload_offset(crypto->block);
-uint64_t sector_num = offset / sector_size;
 
 assert(!flags);
 assert(payload_offset < INT64_MAX);
@@ -488,8 +485,8 @@ block_crypto_co_pwritev(BlockDriverState *bs, uint64_t 
offset, uint64_t bytes,
 
 qemu_iovec_to_buf(qiov, bytes_done, cipher_data, cur_bytes);
 
-if (qcrypto_block_encrypt(crypto->block, sector_num, cipher_data,
-  cur_bytes, NULL) < 0) {
+if (qcrypto_block_encrypt(crypto->block, offset + bytes_done,
+  cipher_data, cur_bytes, NULL) < 0) {
 ret = -EIO;
 goto cleanup;
 }
@@ -503,7 +500,6 @@ block_crypto_co_pwritev(BlockDriverState *bs, uint64_t 
offset, uint64_t bytes,
 goto cleanup;
 }
 
-sector_num += cur_bytes / sector_size;
 bytes -= cur_bytes;
 bytes_done += cur_bytes;
 }
diff --git a/block/qcow.c b/block/qcow.c
index f450b00cfc..9569deeaf0 100644
--- a/block/qcow.c
+++ b/block/qcow.c
@@ -478,7 +478,9 @@ static int get_cluster_offset(BlockDriverState *bs,
 for(i = 0; i < s->cluster_sectors; i++) {
 if (i < n_start || i >= n_end) {
 memset(s->cluster_data, 0x00, 512);
-if (qcrypto_block_encrypt(s->crypto, start_sect + 
i,
+if (qcrypto_block_encrypt(s->crypto,
+  (start_sect + i) *
+  BDRV_SECTOR_SIZE,
   s->cluster_data,
   BDRV_SECTOR_SIZE,
   NULL) < 0) {
@@ -668,7 +670,8 @@ static coroutine_fn int qcow_co_readv(BlockDriverState *bs, 
int64_t sector_num,
 }
 if (bs->encrypted) {
 assert(s->crypto);
-if (qcrypto_block_decrypt(s->crypto, sector_num, buf,
+if (qcrypto_block_decrypt(s->crypto,
+  sector_num * BDRV_SECTOR_SIZE, buf,
   n * BDRV_SECTOR_SIZE, NULL) < 0) {
 ret = -EIO;
 break;
@@ -740,8 +743,8 @@ static coroutine_fn int qcow_co_writev(BlockDriverState 
*bs, int64_t sector_num,
 }
 if (bs->encrypted) {
 assert(s->crypto);
-if (qcrypto_block_encrypt(s->crypto, sector_num, buf,
-  n * BDRV_SECTOR_SIZE, NULL) < 0) {
+if (qcrypto_block_encrypt(s->crypto, sector_num * BDRV_SECTOR_SIZE,
+  buf, n * BDRV_SECTOR_SIZE, NULL) < 0) {
 ret = -EIO;
 break;
 }
diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 0d4824993c..09ae0b6ecc 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -396,15

[Qemu-block] [PATCH v4 4/6] block: convert crypto driver to bdrv_co_preadv|pwritev

2017-09-27 Thread Daniel P. Berrange
Make the crypto driver implement the bdrv_co_preadv|pwritev
callbacks, and also use bdrv_co_preadv|pwritev for I/O
with the protocol driver beneath. This replaces sector based
I/O with byte based I/O, and allows us to stop assuming the
physical sector size matches the encryption sector size.

Signed-off-by: Daniel P. Berrange 
---
 block/crypto.c | 106 +
 1 file changed, 54 insertions(+), 52 deletions(-)

diff --git a/block/crypto.c b/block/crypto.c
index 61f5d77bc0..965c173b01 100644
--- a/block/crypto.c
+++ b/block/crypto.c
@@ -387,18 +387,23 @@ static void block_crypto_close(BlockDriverState *bs)
 #define BLOCK_CRYPTO_MAX_IO_SIZE (1024 * 1024)
 
 static coroutine_fn int
-block_crypto_co_readv(BlockDriverState *bs, int64_t sector_num,
-  int remaining_sectors, QEMUIOVector *qiov)
+block_crypto_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
+   QEMUIOVector *qiov, int flags)
 {
 BlockCrypto *crypto = bs->opaque;
-int cur_nr_sectors; /* number of sectors in current iteration */
+uint64_t cur_bytes; /* number of bytes in current iteration */
 uint64_t bytes_done = 0;
 uint8_t *cipher_data = NULL;
 QEMUIOVector hd_qiov;
 int ret = 0;
-uint64_t payload_offset =
-qcrypto_block_get_payload_offset(crypto->block) / 512;
-assert(payload_offset < (INT64_MAX / 512));
+uint64_t sector_size = qcrypto_block_get_sector_size(crypto->block);
+uint64_t payload_offset = qcrypto_block_get_payload_offset(crypto->block);
+uint64_t sector_num = offset / sector_size;
+
+assert(!flags);
+assert(payload_offset < INT64_MAX);
+assert(QEMU_IS_ALIGNED(offset, sector_size));
+assert(QEMU_IS_ALIGNED(bytes, sector_size));
 
 qemu_iovec_init(&hd_qiov, qiov->niov);
 
@@ -413,37 +418,29 @@ block_crypto_co_readv(BlockDriverState *bs, int64_t 
sector_num,
 goto cleanup;
 }
 
-while (remaining_sectors) {
-cur_nr_sectors = remaining_sectors;
-
-if (cur_nr_sectors > (BLOCK_CRYPTO_MAX_IO_SIZE / 512)) {
-cur_nr_sectors = (BLOCK_CRYPTO_MAX_IO_SIZE / 512);
-}
+while (bytes) {
+cur_bytes = MIN(bytes, BLOCK_CRYPTO_MAX_IO_SIZE);
 
 qemu_iovec_reset(&hd_qiov);
-qemu_iovec_add(&hd_qiov, cipher_data, cur_nr_sectors * 512);
+qemu_iovec_add(&hd_qiov, cipher_data, cur_bytes);
 
-ret = bdrv_co_readv(bs->file,
-payload_offset + sector_num,
-cur_nr_sectors, &hd_qiov);
+ret = bdrv_co_preadv(bs->file, payload_offset + offset + bytes_done,
+ cur_bytes, &hd_qiov, 0);
 if (ret < 0) {
 goto cleanup;
 }
 
-if (qcrypto_block_decrypt(crypto->block,
-  sector_num,
-  cipher_data, cur_nr_sectors * 512,
-  NULL) < 0) {
+if (qcrypto_block_decrypt(crypto->block, sector_num, cipher_data,
+  cur_bytes, NULL) < 0) {
 ret = -EIO;
 goto cleanup;
 }
 
-qemu_iovec_from_buf(qiov, bytes_done,
-cipher_data, cur_nr_sectors * 512);
+qemu_iovec_from_buf(qiov, bytes_done, cipher_data, cur_bytes);
 
-remaining_sectors -= cur_nr_sectors;
-sector_num += cur_nr_sectors;
-bytes_done += cur_nr_sectors * 512;
+sector_num += cur_bytes / sector_size;
+bytes -= cur_bytes;
+bytes_done += cur_bytes;
 }
 
  cleanup:
@@ -455,18 +452,23 @@ block_crypto_co_readv(BlockDriverState *bs, int64_t 
sector_num,
 
 
 static coroutine_fn int
-block_crypto_co_writev(BlockDriverState *bs, int64_t sector_num,
-   int remaining_sectors, QEMUIOVector *qiov)
+block_crypto_co_pwritev(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
+QEMUIOVector *qiov, int flags)
 {
 BlockCrypto *crypto = bs->opaque;
-int cur_nr_sectors; /* number of sectors in current iteration */
+uint64_t cur_bytes; /* number of bytes in current iteration */
 uint64_t bytes_done = 0;
 uint8_t *cipher_data = NULL;
 QEMUIOVector hd_qiov;
 int ret = 0;
-uint64_t payload_offset =
-qcrypto_block_get_payload_offset(crypto->block) / 512;
-assert(payload_offset < (INT64_MAX / 512));
+uint64_t sector_size = qcrypto_block_get_sector_size(crypto->block);
+uint64_t payload_offset = qcrypto_block_get_payload_offset(crypto->block);
+uint64_t sector_num = offset / sector_size;
+
+assert(!flags);
+assert(payload_offset < INT64_MAX);
+assert(QEMU_IS_ALIGNED(offset, sector_size));
+assert(QEMU_IS_ALIGNED(bytes, sector_size));
 
 qemu_iovec_init(&hd_qiov, qiov->niov);
 
@@ -481,37 +483,29 @@ block_crypto_co_writev(BlockDriverState *bs, int64_t 
sector_num,
 goto cleanup;
 

[Qemu-block] [PATCH v4 2/6] crypto: expose encryption sector size in APIs

2017-09-27 Thread Daniel P. Berrange
While current encryption schemes all have a fixed sector size of
512 bytes, this is not guaranteed to be the case in future. Expose
the sector size in the APIs so the block layer can remove assumptions
about fixed 512 byte sectors.

Reviewed-by: Max Reitz 
Reviewed-by: Eric Blake 
Signed-off-by: Daniel P. Berrange 
---
 crypto/block-luks.c|  6 --
 crypto/block-qcow.c|  1 +
 crypto/block.c |  6 ++
 crypto/blockpriv.h |  1 +
 include/crypto/block.h | 15 +++
 5 files changed, 27 insertions(+), 2 deletions(-)

diff --git a/crypto/block-luks.c b/crypto/block-luks.c
index 36bc856084..a9062bb0f2 100644
--- a/crypto/block-luks.c
+++ b/crypto/block-luks.c
@@ -846,8 +846,9 @@ qcrypto_block_luks_open(QCryptoBlock *block,
 }
 }
 
+block->sector_size = QCRYPTO_BLOCK_LUKS_SECTOR_SIZE;
 block->payload_offset = luks->header.payload_offset *
-QCRYPTO_BLOCK_LUKS_SECTOR_SIZE;
+block->sector_size;
 
 luks->cipher_alg = cipheralg;
 luks->cipher_mode = ciphermode;
@@ -1240,8 +1241,9 @@ qcrypto_block_luks_create(QCryptoBlock *block,
QCRYPTO_BLOCK_LUKS_SECTOR_SIZE)) *
  QCRYPTO_BLOCK_LUKS_NUM_KEY_SLOTS);
 
+block->sector_size = QCRYPTO_BLOCK_LUKS_SECTOR_SIZE;
 block->payload_offset = luks->header.payload_offset *
-QCRYPTO_BLOCK_LUKS_SECTOR_SIZE;
+block->sector_size;
 
 /* Reserve header space to match payload offset */
 initfunc(block, block->payload_offset, opaque, &local_err);
diff --git a/crypto/block-qcow.c b/crypto/block-qcow.c
index a456fe338b..4dd594a9ba 100644
--- a/crypto/block-qcow.c
+++ b/crypto/block-qcow.c
@@ -80,6 +80,7 @@ qcrypto_block_qcow_init(QCryptoBlock *block,
 goto fail;
 }
 
+block->sector_size = QCRYPTO_BLOCK_QCOW_SECTOR_SIZE;
 block->payload_offset = 0;
 
 return 0;
diff --git a/crypto/block.c b/crypto/block.c
index c382393d9a..a7a9ad240e 100644
--- a/crypto/block.c
+++ b/crypto/block.c
@@ -170,6 +170,12 @@ uint64_t qcrypto_block_get_payload_offset(QCryptoBlock 
*block)
 }
 
 
+uint64_t qcrypto_block_get_sector_size(QCryptoBlock *block)
+{
+return block->sector_size;
+}
+
+
 void qcrypto_block_free(QCryptoBlock *block)
 {
 if (!block) {
diff --git a/crypto/blockpriv.h b/crypto/blockpriv.h
index 0edb810e22..d227522d88 100644
--- a/crypto/blockpriv.h
+++ b/crypto/blockpriv.h
@@ -36,6 +36,7 @@ struct QCryptoBlock {
 QCryptoHashAlgorithm kdfhash;
 size_t niv;
 uint64_t payload_offset; /* In bytes */
+uint64_t sector_size; /* In bytes */
 };
 
 struct QCryptoBlockDriver {
diff --git a/include/crypto/block.h b/include/crypto/block.h
index f0e543bee1..13232b2472 100644
--- a/include/crypto/block.h
+++ b/include/crypto/block.h
@@ -241,6 +241,21 @@ QCryptoHashAlgorithm 
qcrypto_block_get_kdf_hash(QCryptoBlock *block);
 uint64_t qcrypto_block_get_payload_offset(QCryptoBlock *block);
 
 /**
+ * qcrypto_block_get_sector_size:
+ * @block: the block encryption object
+ *
+ * Get the size of sectors used for payload encryption. A new
+ * IV is used at the start of each sector. The encryption
+ * sector size is not required to match the sector size of the
+ * underlying storage. For example LUKS will always use a 512
+ * byte sector size, even if the volume is on a disk with 4k
+ * sectors.
+ *
+ * Returns: the sector in bytes
+ */
+uint64_t qcrypto_block_get_sector_size(QCryptoBlock *block);
+
+/**
  * qcrypto_block_free:
  * @block: the block encryption object
  *
-- 
2.13.5




[Qemu-block] [PATCH v4 6/6] block: support passthrough of BDRV_REQ_FUA in crypto driver

2017-09-27 Thread Daniel P. Berrange
The BDRV_REQ_FUA flag can trivially be allowed in the crypt driver
as a passthrough to the underlying block driver.

Reviewed-by: Max Reitz 
Reviewed-by: Eric Blake 
Signed-off-by: Daniel P. Berrange 
---
 block/crypto.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/block/crypto.c b/block/crypto.c
index edf53d49d1..60ddf8623e 100644
--- a/block/crypto.c
+++ b/block/crypto.c
@@ -279,6 +279,9 @@ static int block_crypto_open_generic(QCryptoBlockFormat 
format,
 return -EINVAL;
 }
 
+bs->supported_write_flags = BDRV_REQ_FUA &
+bs->file->bs->supported_write_flags;
+
 opts = qemu_opts_create(opts_spec, NULL, 0, &error_abort);
 qemu_opts_absorb_qdict(opts, options, &local_err);
 if (local_err) {
@@ -462,7 +465,7 @@ block_crypto_co_pwritev(BlockDriverState *bs, uint64_t 
offset, uint64_t bytes,
 uint64_t sector_size = qcrypto_block_get_sector_size(crypto->block);
 uint64_t payload_offset = qcrypto_block_get_payload_offset(crypto->block);
 
-assert(!flags);
+assert(!(flags & ~BDRV_REQ_FUA));
 assert(payload_offset < INT64_MAX);
 assert(QEMU_IS_ALIGNED(offset, sector_size));
 assert(QEMU_IS_ALIGNED(bytes, sector_size));
@@ -495,7 +498,7 @@ block_crypto_co_pwritev(BlockDriverState *bs, uint64_t 
offset, uint64_t bytes,
 qemu_iovec_add(&hd_qiov, cipher_data, cur_bytes);
 
 ret = bdrv_co_pwritev(bs->file, payload_offset + offset + bytes_done,
-  cur_bytes, &hd_qiov, 0);
+  cur_bytes, &hd_qiov, flags);
 if (ret < 0) {
 goto cleanup;
 }
-- 
2.13.5




[Qemu-block] [PATCH v4 3/6] block: fix data type casting for crypto payload offset

2017-09-27 Thread Daniel P. Berrange
The crypto APIs report the offset of the data payload as an uint64_t
type, but the block driver is casting to size_t or ssize_t which will
potentially truncate.

Most of the block APIs use int64_t for offsets meanwhile, so even if
using uint64_t in the crypto block driver we are still at risk of
truncation.

Change the block crypto driver to use uint64_t, but add asserts that
the value is less than INT64_MAX.

Reviewed-by: Max Reitz 
Reviewed-by: Eric Blake 
Signed-off-by: Daniel P. Berrange 
---
 block/crypto.c | 13 +
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/block/crypto.c b/block/crypto.c
index 684cabeaf8..61f5d77bc0 100644
--- a/block/crypto.c
+++ b/block/crypto.c
@@ -364,8 +364,9 @@ static int block_crypto_truncate(BlockDriverState *bs, 
int64_t offset,
  PreallocMode prealloc, Error **errp)
 {
 BlockCrypto *crypto = bs->opaque;
-size_t payload_offset =
+uint64_t payload_offset =
 qcrypto_block_get_payload_offset(crypto->block);
+assert(payload_offset < (INT64_MAX - offset));
 
 offset += payload_offset;
 
@@ -395,8 +396,9 @@ block_crypto_co_readv(BlockDriverState *bs, int64_t 
sector_num,
 uint8_t *cipher_data = NULL;
 QEMUIOVector hd_qiov;
 int ret = 0;
-size_t payload_offset =
+uint64_t payload_offset =
 qcrypto_block_get_payload_offset(crypto->block) / 512;
+assert(payload_offset < (INT64_MAX / 512));
 
 qemu_iovec_init(&hd_qiov, qiov->niov);
 
@@ -462,8 +464,9 @@ block_crypto_co_writev(BlockDriverState *bs, int64_t 
sector_num,
 uint8_t *cipher_data = NULL;
 QEMUIOVector hd_qiov;
 int ret = 0;
-size_t payload_offset =
+uint64_t payload_offset =
 qcrypto_block_get_payload_offset(crypto->block) / 512;
+assert(payload_offset < (INT64_MAX / 512));
 
 qemu_iovec_init(&hd_qiov, qiov->niov);
 
@@ -524,7 +527,9 @@ static int64_t block_crypto_getlength(BlockDriverState *bs)
 BlockCrypto *crypto = bs->opaque;
 int64_t len = bdrv_getlength(bs->file->bs);
 
-ssize_t offset = qcrypto_block_get_payload_offset(crypto->block);
+uint64_t offset = qcrypto_block_get_payload_offset(crypto->block);
+assert(offset < INT64_MAX);
+assert(offset < len);
 
 len -= offset;
 
-- 
2.13.5




[Qemu-block] [PATCH v4 1/6] block: use 1 MB bounce buffers for crypto instead of 16KB

2017-09-27 Thread Daniel P. Berrange
Using 16KB bounce buffers creates a significant performance
penalty for I/O to encrypted volumes on storage which high
I/O latency (rotating rust & network drives), because it
triggers lots of fairly small I/O operations.

On tests with rotating rust, and cache=none|directsync,
write speed increased from 2MiB/s to 32MiB/s, on a par
with that achieved by the in-kernel luks driver. With
other cache modes the in-kernel driver is still notably
faster because it is able to report completion of the
I/O request before any encryption is done, while the
in-QEMU driver must encrypt the data before completion.

Signed-off-by: Daniel P. Berrange 
---
 block/crypto.c | 28 +++-
 1 file changed, 15 insertions(+), 13 deletions(-)

diff --git a/block/crypto.c b/block/crypto.c
index 58ef6f2f52..684cabeaf8 100644
--- a/block/crypto.c
+++ b/block/crypto.c
@@ -379,7 +379,11 @@ static void block_crypto_close(BlockDriverState *bs)
 }
 
 
-#define BLOCK_CRYPTO_MAX_SECTORS 32
+/*
+ * 1 MB bounce buffer gives good performance / memory tradeoff
+ * when using cache=none|directsync.
+ */
+#define BLOCK_CRYPTO_MAX_IO_SIZE (1024 * 1024)
 
 static coroutine_fn int
 block_crypto_co_readv(BlockDriverState *bs, int64_t sector_num,
@@ -396,12 +400,11 @@ block_crypto_co_readv(BlockDriverState *bs, int64_t 
sector_num,
 
 qemu_iovec_init(&hd_qiov, qiov->niov);
 
-/* Bounce buffer so we have a linear mem region for
- * entire sector. XXX optimize so we avoid bounce
- * buffer in case that qiov->niov == 1
+/* Bounce buffer because we don't wish to expose cipher text
+ * in qiov which points to guest memory.
  */
 cipher_data =
-qemu_try_blockalign(bs->file->bs, MIN(BLOCK_CRYPTO_MAX_SECTORS * 512,
+qemu_try_blockalign(bs->file->bs, MIN(BLOCK_CRYPTO_MAX_IO_SIZE,
   qiov->size));
 if (cipher_data == NULL) {
 ret = -ENOMEM;
@@ -411,8 +414,8 @@ block_crypto_co_readv(BlockDriverState *bs, int64_t 
sector_num,
 while (remaining_sectors) {
 cur_nr_sectors = remaining_sectors;
 
-if (cur_nr_sectors > BLOCK_CRYPTO_MAX_SECTORS) {
-cur_nr_sectors = BLOCK_CRYPTO_MAX_SECTORS;
+if (cur_nr_sectors > (BLOCK_CRYPTO_MAX_IO_SIZE / 512)) {
+cur_nr_sectors = (BLOCK_CRYPTO_MAX_IO_SIZE / 512);
 }
 
 qemu_iovec_reset(&hd_qiov);
@@ -464,12 +467,11 @@ block_crypto_co_writev(BlockDriverState *bs, int64_t 
sector_num,
 
 qemu_iovec_init(&hd_qiov, qiov->niov);
 
-/* Bounce buffer so we have a linear mem region for
- * entire sector. XXX optimize so we avoid bounce
- * buffer in case that qiov->niov == 1
+/* Bounce buffer because we're not permitted to touch
+ * contents of qiov - it points to guest memory.
  */
 cipher_data =
-qemu_try_blockalign(bs->file->bs, MIN(BLOCK_CRYPTO_MAX_SECTORS * 512,
+qemu_try_blockalign(bs->file->bs, MIN(BLOCK_CRYPTO_MAX_IO_SIZE,
   qiov->size));
 if (cipher_data == NULL) {
 ret = -ENOMEM;
@@ -479,8 +481,8 @@ block_crypto_co_writev(BlockDriverState *bs, int64_t 
sector_num,
 while (remaining_sectors) {
 cur_nr_sectors = remaining_sectors;
 
-if (cur_nr_sectors > BLOCK_CRYPTO_MAX_SECTORS) {
-cur_nr_sectors = BLOCK_CRYPTO_MAX_SECTORS;
+if (cur_nr_sectors > (BLOCK_CRYPTO_MAX_IO_SIZE / 512)) {
+cur_nr_sectors = (BLOCK_CRYPTO_MAX_IO_SIZE / 512);
 }
 
 qemu_iovec_to_buf(qiov, bytes_done,
-- 
2.13.5




[Qemu-block] [PATCH v4 0/6] Misc improvements to crypto block driver

2017-09-27 Thread Daniel P. Berrange
This is a followup to

  v1: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg00781.html
  v2: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg06464.html
  v3: https://lists.gnu.org/archive/html/qemu-devel/2017-09/msg02923.html

This collection of patches first improves the performance of the
crypto block driver and then does various cleanups to improve ongoing
maint work.

Changed in v4:

  - Drop intermediate patch that replaced '512' with a constant (Max)
  - Use MIN() macro where needed (Max)
  - Fix bounce buffer size at 1MB instead of varying per sector size (Max)
  - Convert missing qcrypto_block_encrypt call to sectors in qcow.c (Max)

Changed in v3:

  - Support passthrough of BDRV_REQ_FUA (Eric)
  - Fix potential truncation of payload offset values (Eric)
  - Use encryption scheme sector size instead of BDRV_SECTOR_SIZE (Kevin)
  - Use QEMU_IS_ALIGNED where appropriate (Eric)
  - Remove unused 'sector_num' variable (Eric)
  - Fix whitespace alignment (Eric)
  - Fix math error in qcow conversion (Eric)

Daniel P. Berrange (6):
  block: use 1 MB bounce buffers for crypto instead of 16KB
  crypto: expose encryption sector size in APIs
  block: fix data type casting for crypto payload offset
  block: convert crypto driver to bdrv_co_preadv|pwritev
  block: convert qcrypto_block_encrypt|decrypt to take bytes offset
  block: support passthrough of BDRV_REQ_FUA in crypto driver

 block/crypto.c | 130 ++---
 block/qcow.c   |  11 +++--
 block/qcow2-cluster.c  |   8 ++-
 block/qcow2.c  |   4 +-
 crypto/block-luks.c|  18 ---
 crypto/block-qcow.c|  13 +++--
 crypto/block.c |  26 +++---
 crypto/blockpriv.h |   5 +-
 include/crypto/block.h |  29 ---
 9 files changed, 148 insertions(+), 96 deletions(-)

-- 
2.13.5




Re: [Qemu-block] [PATCH v3 5/7] block: convert crypto driver to bdrv_co_preadv|pwritev

2017-09-27 Thread Daniel P. Berrange
On Sat, Sep 16, 2017 at 06:54:58PM +0200, Max Reitz wrote:
> On 2017-09-12 13:28, Daniel P. Berrange wrote:
> > Make the crypto driver implement the bdrv_co_preadv|pwritev
> > callbacks, and also use bdrv_co_preadv|pwritev for I/O
> > with the protocol driver beneath.
> > 
> > Signed-off-by: Daniel P. Berrange 
> > ---
> >  block/crypto.c | 104 
> > +++--
> >  1 file changed, 56 insertions(+), 48 deletions(-)
> 
> Reviewed-by: Max Reitz 
> 
> Some notes below.
> 
> > diff --git a/block/crypto.c b/block/crypto.c
> > index 49d6d4c058..d004e9cef4 100644
> > --- a/block/crypto.c
> > +++ b/block/crypto.c
> 
> [...]
> 
> > @@ -410,37 +414,33 @@ block_crypto_co_readv(BlockDriverState *bs, int64_t 
> > sector_num,
> >  goto cleanup;
> >  }
> >  
> > -while (remaining_sectors) {
> > -cur_nr_sectors = remaining_sectors;
> > +while (bytes) {
> > +cur_bytes = bytes;
> >  
> > -if (cur_nr_sectors > BLOCK_CRYPTO_MAX_SECTORS) {
> > -cur_nr_sectors = BLOCK_CRYPTO_MAX_SECTORS;
> > +if (cur_bytes > (BLOCK_CRYPTO_MAX_SECTORS * sector_size)) {
> > +cur_bytes = BLOCK_CRYPTO_MAX_SECTORS * sector_size;
> 
> It's a bit weird that now the bounce buffer's size is now no longer
> fixed at 1 MB but variable depending on the crypto driver's block size.
> It also doesn't seem too intentional when looking at the first patch's
> commit message...
> 
> In any case, that would be an issue in the previous patch, though.  In
> general, I'm wondering whether maybe you should squash this patch into
> the previous one...  Yes, that would make the for a larger patch, but it
> wouldn't leave some not-quite-right state in between where sector_size
> is generally assumed to be equal to BDRV_SECTOR_SIZE -- which it is in
> practice, but not necessarily in theory.

In the end i'm going with this approach - just dropping the previous
patch entirely, since 99% of what it does is then removed in this
patch and the next one.


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|



Re: [Qemu-block] [PATCH 8/8] nbd: Minimal structured read for client

2017-09-27 Thread Vladimir Sementsov-Ogievskiy

27.09.2017 13:05, Vladimir Sementsov-Ogievskiy wrote:

26.09.2017 01:19, Eric Blake wrote:

On 09/25/2017 08:58 AM, Vladimir Sementsov-Ogievskiy wrote:

Minimal implementation: drop most of additional error information.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  block/nbd-client.h  |   2 +
  include/block/nbd.h |  15 -
  block/nbd-client.c  |  97 +-
  nbd/client.c    | 169 
+++-

  4 files changed, 249 insertions(+), 34 deletions(-)

+++ b/include/block/nbd.h
@@ -57,11 +57,17 @@ struct NBDRequest {
  };
  typedef struct NBDRequest NBDRequest;
  -struct NBDReply {
+typedef struct NBDReply {
+    bool simple;
  uint64_t handle;
  uint32_t error;
-};
-typedef struct NBDReply NBDReply;
+
+    uint16_t flags;
+    uint16_t type;
+    uint32_t tail_length;
+    uint64_t offset;
+    uint32_t hole_size;
+} NBDReply;

This feels like it should be a discriminated union, rather than a struct
containing fields that are only sometimes valid...


No:

simple, handle and error are always valid
flags, type, tail_length are valid for all structured replies
offset and hole_size are valid for structured hole reply

so, we have one case when all fields are valid.. how do you see it 
with union, what is the real difference? I think it would be just a 
complication.





    #define NBD_SREP_TYPE_NONE 0
  #define NBD_SREP_TYPE_OFFSET_DATA   1
+#define NBD_SREP_TYPE_OFFSET_HOLE   2
  #define NBD_SREP_TYPE_ERROR NBD_SREP_ERR(1)
+#define NBD_SREP_TYPE_ERROR_OFFSET  NBD_SREP_ERR(2)

...especially since there is more than one type of SREP packet layout.

I also wonder why you are defining constants in a piecemeal fashion,
rather than all at once (even if your minimal server implementation
doesn't send a particular constant, there's no harm in defining them all
up front).


hmm. just to not define unused constants. It doesn't matter, I can 
define them all if you prefer.





+++ b/block/nbd-client.c
@@ -179,9 +179,10 @@ err:
  return rc;
  }
  -static int nbd_co_receive_reply(NBDClientSession *s,
-    uint64_t handle,
-    QEMUIOVector *qiov)
+static int nbd_co_receive_1_reply_or_chunk(NBDClientSession *s,

Long name, and unusual to mix in "1" instead of "one".  Would it be
better to name it nbd_co_receive_chunk, where we declare that a simple
reply is (roughly) the same amount of effort as a chunk in a structured
reply?


+ uint64_t handle,
+   bool *cont,
+   QEMUIOVector *qiov)
  {

No documentation of the function?


  int ret;
  int i = HANDLE_TO_INDEX(s, handle);
@@ -191,29 +192,95 @@ static int 
nbd_co_receive_reply(NBDClientSession *s,

  qemu_coroutine_yield();
  s->requests[i].receiving = false;
  if (!s->ioc || s->quit) {
-    ret = -EIO;
-    } else {
-    assert(s->reply.handle == handle);
-    ret = -s->reply.error;
-    if (qiov && s->reply.error == 0) {
+    *cont = false;
+    return -EIO;
+    }
+
+    assert(s->reply.handle == handle);
+    *cont = !(s->reply.simple || (s->reply.flags & 
NBD_SREP_FLAG_DONE));

We need to validate that the server is not sending us SREP chunks unless
we negotiated them.  I'm thinking it might be better to do it here
(maybe you did it somewhere else, but I haven't seen it yet; I'm
reviewing the patch in textual order rather than the order in which
things are called).


No, I didn't. Will add (may be early, in reply_entry).




+    ret = -s->reply.error;
+    if (ret < 0) {
+    goto out;
+    }
+
+    if (s->reply.simple) {
+    if (qiov) {
  if (qio_channel_readv_all(s->ioc, qiov->iov, qiov->niov,
-  NULL) < 0) {
-    ret = -EIO;
-    s->quit = true;
+  NULL) < 0)
+    {
+    goto fatal;
  }
  }
+    goto out;
+    }
  -    /* Tell the read handler to read another header. */
-    s->reply.handle = 0;
+    /* here we deal with successful structured reply */
+    switch (s->reply.type) {
+    QEMUIOVector sub_qiov;
+    case NBD_SREP_TYPE_OFFSET_DATA:

This is putting a LOT of smarts directly into the receive routine.
Here's where I was previously wondering (and I think Paolo as well)
whether it might be better to split the efforts: the generic function
reads off the chunk information and any payload, but a per-command


Hmm. it was my idea to move all reading into one coroutine (in my 
refactoring series, but Paolo was against).


Or you mean to read a payload as raw? It would lead to double copying 
it to destination qiov, which I dislike..



callback function then parses the chunks.


per-command? Then callback for CMD_READ would have all of these 
"smarts", so the whole code would not be simpler.. (However it will 
simpl

Re: [Qemu-block] [PATCH 8/8] nbd: Minimal structured read for client

2017-09-27 Thread Vladimir Sementsov-Ogievskiy

26.09.2017 01:19, Eric Blake wrote:

On 09/25/2017 08:58 AM, Vladimir Sementsov-Ogievskiy wrote:

Minimal implementation: drop most of additional error information.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  block/nbd-client.h  |   2 +
  include/block/nbd.h |  15 -
  block/nbd-client.c  |  97 +-
  nbd/client.c| 169 +++-
  4 files changed, 249 insertions(+), 34 deletions(-)

+++ b/include/block/nbd.h
@@ -57,11 +57,17 @@ struct NBDRequest {
  };
  typedef struct NBDRequest NBDRequest;
  
-struct NBDReply {

+typedef struct NBDReply {
+bool simple;
  uint64_t handle;
  uint32_t error;
-};
-typedef struct NBDReply NBDReply;
+
+uint16_t flags;
+uint16_t type;
+uint32_t tail_length;
+uint64_t offset;
+uint32_t hole_size;
+} NBDReply;

This feels like it should be a discriminated union, rather than a struct
containing fields that are only sometimes valid...


No:

simple, handle and error are always valid
flags, type, tail_length are valid for all structured replies
offset and hole_size are valid for structured hole reply

so, we have one case when all fields are valid.. how do you see it with 
union, what is the real difference? I think it would be just a complication.




  
  #define NBD_SREP_TYPE_NONE  0

  #define NBD_SREP_TYPE_OFFSET_DATA   1
+#define NBD_SREP_TYPE_OFFSET_HOLE   2
  #define NBD_SREP_TYPE_ERROR NBD_SREP_ERR(1)
+#define NBD_SREP_TYPE_ERROR_OFFSET  NBD_SREP_ERR(2)

...especially since there is more than one type of SREP packet layout.

I also wonder why you are defining constants in a piecemeal fashion,
rather than all at once (even if your minimal server implementation
doesn't send a particular constant, there's no harm in defining them all
up front).


hmm. just to not define unused constants. It doesn't matter, I can 
define them all if you prefer.





+++ b/block/nbd-client.c
@@ -179,9 +179,10 @@ err:
  return rc;
  }
  
-static int nbd_co_receive_reply(NBDClientSession *s,

-uint64_t handle,
-QEMUIOVector *qiov)
+static int nbd_co_receive_1_reply_or_chunk(NBDClientSession *s,

Long name, and unusual to mix in "1" instead of "one".  Would it be
better to name it nbd_co_receive_chunk, where we declare that a simple
reply is (roughly) the same amount of effort as a chunk in a structured
reply?


+   uint64_t handle,
+   bool *cont,
+   QEMUIOVector *qiov)
  {

No documentation of the function?


  int ret;
  int i = HANDLE_TO_INDEX(s, handle);
@@ -191,29 +192,95 @@ static int nbd_co_receive_reply(NBDClientSession *s,
  qemu_coroutine_yield();
  s->requests[i].receiving = false;
  if (!s->ioc || s->quit) {
-ret = -EIO;
-} else {
-assert(s->reply.handle == handle);
-ret = -s->reply.error;
-if (qiov && s->reply.error == 0) {
+*cont = false;
+return -EIO;
+}
+
+assert(s->reply.handle == handle);
+*cont = !(s->reply.simple || (s->reply.flags & NBD_SREP_FLAG_DONE));

We need to validate that the server is not sending us SREP chunks unless
we negotiated them.  I'm thinking it might be better to do it here
(maybe you did it somewhere else, but I haven't seen it yet; I'm
reviewing the patch in textual order rather than the order in which
things are called).


No, I didn't. Will add (may be early, in reply_entry).




+ret = -s->reply.error;
+if (ret < 0) {
+goto out;
+}
+
+if (s->reply.simple) {
+if (qiov) {
  if (qio_channel_readv_all(s->ioc, qiov->iov, qiov->niov,
-  NULL) < 0) {
-ret = -EIO;
-s->quit = true;
+  NULL) < 0)
+{
+goto fatal;
  }
  }
+goto out;
+}
  
-/* Tell the read handler to read another header.  */

-s->reply.handle = 0;
+/* here we deal with successful structured reply */
+switch (s->reply.type) {
+QEMUIOVector sub_qiov;
+case NBD_SREP_TYPE_OFFSET_DATA:

This is putting a LOT of smarts directly into the receive routine.
Here's where I was previously wondering (and I think Paolo as well)
whether it might be better to split the efforts: the generic function
reads off the chunk information and any payload, but a per-command


Hmm. it was my idea to move all reading into one coroutine (in my 
refactoring series, but Paolo was against).


Or you mean to read a payload as raw? It would lead to double copying it 
to destination qiov, which I dislike..



callback function then parses the chunks.


per-command? Then callback for CMD_READ would have all of these 
"smarts", so the whole code would not be simpler.. (However it will 
simplif

Re: [Qemu-block] [Qemu-devel] [PATCH] virtio: introduce `info virtio' hmp command

2017-09-27 Thread Cornelia Huck
On Tue, 26 Sep 2017 19:20:20 +0200
Kevin Wolf  wrote:

> Am 26.09.2017 um 18:13 hat Jan Dakinevich geschrieben:
> > The command is intended for exposing device specific virtio feature bits
> > and their negotiation status. It is convenient and useful for debug
> > purpose.
> > 
> > Names of features are taken from a devices via get_feature_name() within
> > VirtioDeviceClass. If certain device doesn't implement it, the command
> > will print only hexadecimal value of feature mask.
> > 
> > Cc: Denis V. Lunev 
> > Signed-off-by: Jan Dakinevich 
> > ---
> > The patch suggests an another approach for this:
> > 
> > "virtio: show guest features in 'info qtree'"
> > http://lists.nongnu.org/archive/html/qemu-devel/2016-05/msg01609.html  
> 
> Actually, I think the original approach is much nicer because it uses
> generic infrastructure rather than introducing a one-off monitor command
> that is specific to a certain class of devices.

Probably unsurprisingly, I like this approach much better.

> 
> But at the same time I can see Cornelia's point that this would create a
> whole lot of properties and if we did the same thing consistently for
> all devices, the output of 'info qtree' would be completely cluttered
> up.
> 
> Maybe it would make sense to have properties that can be queried with
> QOM, but that don't show up in 'info qtree'? If we then add some HMP
> 'qom-get' command that allows dumping a whole device as a QOM object
> instead of having to query property by property, it should be quite
> convenient to use.

It's not just that you have to add a property for every feature bit.
They would also need either more state (to cover host offered and guest
accepted bits), or they would need to be duplicated for host and guest
bits. Additionally, we need to consider that feature negotiation is only
final for virtio-1 devices if the feature accepted status has been set
-- that's why I think providing the current status makes sense as well.

I do like your idea of queryable properties and HMP-driven qom-get,
though.